Search Results for author: Lei Sun

Found 58 papers, 18 papers with code

Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis

no code implementations • 22 Jan 2024 • Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo

Our end-to-end system achieves state-of-the-art performance on two large-scale document layout analysis datasets (PubLayNet and DocLayNet), a high-quality hierarchical document structure reconstruction dataset (HRDoc), and our Comp-HRDoc benchmark.

Document Layout Analysis Document Summarization +4

UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents

no code implementations • 17 Jan 2024 • Kai Hu, Jiawei Wang, WeiHong Lin, Zhuoyao Zhong, Lei Sun, Qiang Huo

This unified approach allows for the definition of various relation types and effectively tackles hierarchical relationships in form-like documents.

Key Information Extraction Relation

Dynamic Relation Transformer for Contextual Text Block Detection

no code implementations • 17 Jan 2024 • Jiawei Wang, Shunchi Zhang, Kai Hu, Chixiang Ma, Zhuoyao Zhong, Lei Sun, Qiang Huo

Contextual Text Block Detection (CTBD) is the task of identifying coherent text blocks within the complexity of natural scenes.

Graph Generation Relation +1

Perceptual Quality Assessment for Video Frame Interpolation

no code implementations • 25 Dec 2023 • Jinliang Han, Xiongkuo Min, Yixuan Gao, Jun Jia, Lei Sun, Zuowei Cao, Yonglin Luo, Guangtao Zhai

To evaluate the quality of VFI frames without reference videos, a no-reference perceptual quality assessment method is proposed in this paper.

Image Quality Assessment Video Frame Interpolation

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

no code implementations • 28 Aug 2023 • Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios.

speaker-diarization Speaker Diarization +2

Key Gene Mining in Transcriptional Regulation for Specific Biological Processes with Small Sample Sizes Using Multi-network pipeline Transformer

no code implementations • 7 Aug 2023 • Kerui Huang, Jianhong Tian, Lei Sun, Li Zeng, Peng Xie, Aihua Deng, Ping Mo, Zhibo Zhou, Ming Jiang, Yun Wang, Xiaocheng Jiang

Gene mining is an important topic in the field of life sciences, but traditional machine learning methods cannot consider the regulatory relationships between genes.

Data Augmentation

Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers

1 code implementation • 22 Jun 2023 • Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang

In this paper, we propose a Panoramic Computational Imaging Engine (PCIE) to address minimalist and high-quality panoramic imaging.

Super-Resolution

A Question-Answering Approach to Key Value Pair Extraction from Form-like Document Images

no code implementations • 17 Apr 2023 • Kai Hu, Zhuoyuan Wu, Zhuoyao Zhong, WeiHong Lin, Lei Sun, Qiang Huo

In this paper, we present a new question-answering (QA) based key-value pair extraction approach, called KVPFormer, to robustly extract key-value relationships between entities from form-like document images.

Question Answering

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

no code implementations • 21 Mar 2023 • Jiawei Wang, WeiHong Lin, Chixiang Ma, Mingze Li, Zheng Sun, Lei Sun, Qiang Huo

Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage dynamic queries enhanced DETR based separation line regression approach, named DQ-DETR, to predict separation lines from table images directly.

Image Segmentation regression +2

Improving Fast Auto-Focus with Event Polarity

no code implementations • 15 Mar 2023 • Yuhan Bao, Lei Sun, Yuqin Ma, Diyang Gu, Kaiwei Wang

Specifically, the symmetrical relationship between the event polarities in focusing is investigated, and the event-based focus evaluation function is proposed based on the principles of the event cameras and the imaging model in the focusing process.

Computational Optics Meet Domain Adaptation: Transferring Semantic Segmentation Beyond Aberrations

no code implementations • 21 Nov 2022 • Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Kaiwei Wang

Further, we propose Computational Imaging Assisted Domain Adaptation (CIADA) to leverage prior knowledge of CI for robust performance in SSOA.

Scene Understanding Semantic Segmentation +1

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

2 code implementations • 7 Nov 2022 • Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, Dan Zhu, Mengdi Sun, Ran Duan, Yan Gao, Lingshun Kong, Long Sun, Xiang Li, Xingdong Zhang, Jiawei Zhang, Yaqi Wu, Jinshan Pan, Gaocheng Yu, Jin Zhang, Feng Zhang, Zhe Ma, Hongbin Wang, Hojin Cho, Steve Kim, Huaen Li, Yanbo Ma, Ziwei Luo, Youwei Li, Lei Yu, Zhihong Wen, Qi Wu, Haoqiang Fan, Shuaicheng Liu, Lize Zhang, Zhikai Zong, Jeremy Kwon, Junxi Zhang, Mengyuan Li, Nianxiang Fu, Guanchen Ding, Han Zhu, Zhenzhong Chen, Gen Li, Yuanfan Zhang, Lei Sun, Dafeng Zhang, Neo Yang, Fitz Liu, Jerry Zhao, Mustafa Ayazoglu, Bahri Batuhan Bilecen, Shota Hirose, Kasidis Arunruangsirilert, Luo Ao, Ho Chun Leung, Andrew Wei, Jie Liu, Qiang Liu, Dahai Yu, Ao Li, Lei Luo, Ce Zhu, Seongmin Hong, Dongwon Park, Joonhee Lee, Byeong Hyun Lee, Seunggyu Lee, Se Young Chun, Ruiyuan He, Xuhao Jiang, Haihang Ruan, Xinjian Zhang, Jing Liu, Garas Gendy, Nabil Sabor, Jingchao Hou, Guanghui He

While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints.

Image Super-Resolution

Robust Human Matting via Semantic Guidance

1 code implementation • 11 Oct 2022 • Xiangguang Chen, Ye Zhu, Yu Li, Bingtao Fu, Lei Sun, Ying Shan, Shan Liu

Unlike previous works, our framework is data efficient, which requires a small amount of matting ground-truth to learn to estimate high quality object mattes.

Image Matting Segmentation

TSRFormer: Table Structure Recognition with Transformers

no code implementations • 9 Aug 2022 • WeiHong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.

Ranked #2 on Table Recognition on PubTabNet (TEDS-Struct metric)

Image Segmentation Relation Network +2

Real Image Restoration via Structure-preserving Complementarity Attention

no code implementations • 28 Jul 2022 • Yuanfan Zhang, Gen Li, Lei Sun

Since convolutional neural networks perform well in learning generalizable image priors from large-scale data, these models have been widely used in image denoising tasks.

Image Denoising Image Restoration +1

DETRs with Hybrid Matching

8 code implementations • CVPR 2023 • Ding Jia, Yuhui Yuan, Haodi He, Xiaopei Wu, Haojun Yu, WeiHong Lin, Lei Sun, Chao Zhang, Han Hu

One-to-one set matching is a key design for DETR to establish its end-to-end capability, so that object detection does not require a hand-crafted NMS (non-maximum suppression) to remove duplicate detections.

Object Detection Pose Estimation +2

FaceFormer: Scale-aware Blind Face Restoration with Transformers

no code implementations • 20 Jul 2022 • Aijin Li, Gen Li, Lei Sun, Xintao Wang

Blind face restoration usually encounters face inputs of diverse scales, especially in the real world.

Blind Face Restoration

Multi-Task Learning Framework for Emotion Recognition in-the-wild

1 code implementation • 19 Jul 2022 • Tenggan Zhang, Chuanhe Liu, Xiaolong Liu, Yuchen Liu, Liyu Meng, Lei Sun, Wenqiang Jiang, Fengyuan Zhang, Jinming Zhao, Qin Jin

This paper presents our system for the Multi-Task Learning (MTL) Challenge in the 4th Affective Behavior Analysis in-the-wild (ABAW) competition.

Emotion Recognition Multi-Task Learning +1

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

1 code implementation • 13 Jun 2022 • Qi Jiang, Hao Shi, Lei Sun, Shaohua Gao, Kailun Yang, Kaiwei Wang

In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design.

Image Restoration

Efficient Human Pose Estimation via 3D Event Point Cloud

1 code implementation • 9 Jun 2022 • Jiaan Chen, Hao Shi, Yaozu Ye, Kailun Yang, Lei Sun, Kaiwei Wang

We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints.

3D Human Pose Estimation Edge-computing +1

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00 dB on DIV2K validation set.

Image Super-Resolution

Rethinking Classifier and Adversarial Attack

no code implementations • 4 May 2022 • Youhuan Yang, Lei Sun, Leyu Dai, Song Guo, Xiuqing Mao, Xiaoqin Wang, Bayi Xu

Various defense models have been proposed to resist adversarial attack algorithms, but existing adversarial robustness evaluation methods always overestimate the adversarial robustness of these models (i.e., not approaching the lower bound of robustness).

Adversarial Attack Adversarial Robustness

CE-based white-box adversarial attacks will not work using super-fitting

no code implementations • 4 May 2022 • Youhuan Yang, Lei Sun, Leyu Dai, Song Guo, Xiuqing Mao, Xiaoqin Wang, Bayi Xu

This is especially dangerous for some systems with high-security requirements, so this paper proposes a new defense method by using the model super-fitting state to improve the model's adversarial robustness (i.e., the accuracy under adversarial attacks).

Adversarial Attack Adversarial Robustness

IMOT: General-Purpose, Fast and Robust Estimation for Spatial Perception Problems with Outliers

no code implementations • 4 Apr 2022 • Lei Sun

Spatial perception problems are the fundamental building blocks of robotics and computer vision.

Point Cloud Registration

Robust Table Detection and Structure Recognition from Heterogeneous Document Images

no code implementations • 17 Mar 2022 • Chixiang Ma, WeiHong Lin, Lei Sun, Qiang Huo

We introduce a new table detection and structure recognition approach named RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of each table from heterogeneous document images.

Ranked #5 on Table Recognition on PubTabNet (TEDS-Struct metric)

Region Proposal Table Detection +1

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

1 code implementation • 30 Nov 2021 • Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc van Gool

Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.

Ranked #3 on Deblurring on GoPro (using extra training data)

Deblurring Image Deblurring +1

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

1 code implementation • 10 Nov 2021 • Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu

Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.

Recommendation Systems

Practical, Fast and Robust Point Cloud Registration for 3D Scene Stitching and Object Localization

no code implementations • 8 Nov 2021 • Lei Sun

3D point cloud registration ranks among the most fundamental problems in remote sensing, photogrammetry, robotics and geometric computer vision.

3D Feature Matching Benchmarking +2

TriVoC: Efficient Voting-based Consensus Maximization for Robust Point Cloud Registration with Extreme Outlier Ratios

no code implementations • 1 Nov 2021 • Lei Sun, Lu Deng

Correspondence-based point cloud registration is a cornerstone in robotics perception and computer vision, which seeks to estimate the best rigid transformation aligning two point clouds from the putative correspondences.

Point Cloud Registration

DANIEL: A Fast and Robust Consensus Maximization Method for Point Cloud Registration with High Outlier Ratios

1 code implementation • 11 Oct 2021 • Lei Sun

In this paper, we present a novel time-efficient RANSAC-type consensus maximization solver, named DANIEL (Double-layered sAmpliNg with consensus maximization based on stratIfied Element-wise compatibiLity), for robust registration.

Point Cloud Registration

Conditional DETR for Fast Training Convergence

3 code implementations • ICCV 2021 • Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang

Our approach, named conditional DETR, learns a conditional spatial query from the decoder embedding for decoder multi-head cross-attention.

Object object-detection +1

Separation Guided Speaker Diarization in Realistic Mismatched Conditions

no code implementations • 6 Jul 2021 • Shu-Tong Niu, Jun Du, Lei Sun, Chin-Hui Lee

We propose a separation guided speaker diarization (SGSD) approach by fully utilizing a complementarity of speech separation and speaker clustering.

Clustering speaker-diarization +2

On the Connection between Local Attention and Dynamic Depth-wise Convolution

1 code implementation • ICLR 2022 • Qi Han, Zejia Fan, Qi Dai, Lei Sun, Ming-Ming Cheng, Jiaying Liu, Jingdong Wang

Sparse connectivity: there is no connection across channels, and each position is connected to the positions within a small local window.

object-detection Object Detection +2

ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents

no code implementations • 25 May 2021 • WeiHong Lin, Qifang Gao, Lei Sun, Zhuoyao Zhong, Kai Hu, Qin Ren, Qiang Huo

In this paper, we propose a new multi-modal backbone network by concatenating a BERTgrid to an intermediate layer of a CNN model, where the input of CNN is a document image and the BERTgrid is a grid of word embeddings, to generate a more powerful grid-based document representation, named ViBERTgrid.

Image Segmentation Key Information Extraction +4

Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos

no code implementations • 15 May 2021 • Lei Sun, Jia Wang, Kailun Yang, Kaikai Wu, Xiangdong Zhou, Kaiwei Wang, Jian Bai

A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing.

Scene Parsing Scene Segmentation +1

ICOS: Efficient and Highly Robust Rotation Search and Point Cloud Registration with Correspondences

no code implementations • 30 Apr 2021 • Lei Sun

Rotation search and point cloud registration are two fundamental problems in robotics and computer vision, which aim to estimate the rotation and the transformation between the 3D vector sets and point clouds, respectively.

Point Cloud Registration

RANSIC: Fast and Highly Robust Estimation for Rotation Search and Point Cloud Registration using Invariant Compatibility

no code implementations • 19 Apr 2021 • Lei Sun

Correspondence-based rotation search and point cloud registration are two fundamental problems in robotics and computer vision.

Point Cloud Registration

IRON: Invariant-based Highly Robust Point Cloud Registration

no code implementations • 7 Mar 2021 • Lei Sun

Once the scale is estimated, our second contribution is to relax the non-convex global registration problem into a convex Semi-Definite Program (SDP) in a certifiable way using Sum-of-Squares (SOS) Relaxation and show that the relaxation is tight.

Point Cloud Registration Translation

The Post-impact Evolution of the X-ray Emitting Gas in SNR 1987A Viewed by XMM-Newton

no code implementations • 5 Mar 2021 • Lei Sun, Jacco Vink, Yang Chen, Ping Zhou, Dmitry Prokhorov, Gerd Puhlhofer, Denys Malyshev

In the last few years, the emission measure of the low-temperature plasma has been decreasing, indicating that the blast wave has left the main ER.

High Energy Astrophysical Phenomena

P-KDGAN: Progressive Knowledge Distillation with GANs for One-class Novelty Detection

no code implementations • 14 Jul 2020 • Zhiwei Zhang, Shifeng Chen, Lei Sun

The progressive learning of knowledge distillation is a two-step approach that continuously improves the performance of the student GAN and achieves better performance than single step methods.

Knowledge Distillation Novelty Detection +1

Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

no code implementations • 22 Apr 2020 • Lei Sun, Ke Li

In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution.

Thompson Sampling

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

no code implementations • 16 Mar 2020 • Chixiang Ma, Lei Sun, Zhuoyao Zhong, Qiang Huo

The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs.

Link Prediction Region Proposal +4

Adversarial Balancing-based Representation Learning for Causal Effect Inference with Observational Data

2 code implementations • 30 Apr 2019 • Xin Du, Lei Sun, Wouter Duivesteijn, Alexander Nikolaev, Mykola Pechenizkiy

The challenges for this problem are two-fold: on the one hand, we have to derive a causal estimator to estimate the causal quantity from observational data, where there exists confounding bias; on the other hand, we have to deal with the identification of CATE when the distributions of covariates in treatment and control groups are imbalanced.

Causal Inference Representation Learning +2

Solving the Empirical Bayes Normal Means Problem with Correlated Noise

1 code implementation • 18 Dec 2018 • Lei Sun, Matthew Stephens

The Normal Means problem plays a fundamental role in many areas of modern high-dimensional statistics, both in theory and practice.

Mask R-CNN with Pyramid Attention Network for Scene Text Detection

no code implementations • 22 Nov 2018 • Zhida Huang, Zhuoyao Zhong, Lei Sun, Qiang Huo

In this paper, we present a new Mask R-CNN based text detection approach which can robustly detect multi-oriented and curved text from natural scene images in a unified manner.

Curved Text Detection Text Detection

Gradient-Free Learning Based on the Kernel and the Range Space

no code implementations • 27 Oct 2018 • Kar-Ann Toh, Zhiping Lin, Zhengguo Li, Beomseok Oh, Lei Sun

In this article, we show that solving the system of linear equations by manipulating the kernel and the range space is equivalent to solving the problem of least squares error approximation.

Deterministic Stretchy Regression

no code implementations • 9 Jun 2018 • Kar-Ann Toh, Lei Sun, Zhiping Lin

An extension of the regularized least-squares in which the estimation parameters are stretchable is introduced and studied in this paper.

regression

An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches

no code implementations • 24 Apr 2018 • Zhuoyao Zhong, Lei Sun, Qiang Huo

The anchor mechanism of the Faster R-CNN and SSD frameworks is considered not effective enough for scene text detection, which can be attributed to its IoU based matching criterion between anchors and ground-truth boxes.

Region Proposal Scene Text Detection +1

Bayesian $l_0$-regularized Least Squares

no code implementations • 31 May 2017 • Nicholas G. Polson, Lei Sun

To illustrate our methodology, we provide simulation evidence and a real data example on the statistical properties and computational efficiency of SBR versus direct posterior sampling using spike-and-slab priors.

Computational Efficiency Variable Selection

Constrained Maximum Correntropy Adaptive Filtering

no code implementations • 6 Oct 2016 • Siyuan Peng, Badong Chen, Lei Sun, Zhiping Lin, Wee Ser

Most existing constrained adaptive filtering algorithms are developed under mean square error (MSE) criterion, which is an ideal optimality criterion under Gaussian noises.
