no code implementations • 29 Oct 2024 • Renze Lou, Hanzi Xu, Sijia Wang, Jiangshu Du, Ryo Kamoi, Xiaoxin Lu, Jian Xie, Yuxuan Sun, Yusen Zhang, Jihyun Janice Ahn, Hongchao Fang, Zhuoyang Zou, Wenchao Ma, Xi Li, Kai Zhang, Congying Xia, Lifu Huang, Wenpeng Yin
Numerous studies have assessed the proficiency of AI systems, particularly large language models (LLMs), in facilitating everyday tasks such as email writing, question answering, and creative content generation.
no code implementations • 23 Oct 2024 • Xiaohuan Bi, Xi Li
Federated learning (FL) enables decentralized model training while preserving privacy.
1 code implementation • 21 Oct 2024 • Guangcong Zheng, Teng Li, Rui Jiang, Yehao Lu, Tao Wu, Xi Li
We innovatively associate the quality of a condition with its ability to reduce uncertainty and interpret noisy cross-frame features as a form of noisy condition.
no code implementations • 11 Oct 2024 • Shengyu Hao, Wenhao Chai, Zhonghan Zhao, Meiqi Sun, Wendi Hu, Jieyang Zhou, Yixian Zhao, Qi Li, Yizhou Wang, Xi Li, Gaoang Wang
Addressing this issue, this paper introduces a novel zero-shot approach for the 3D reconstruction and tracking of all objects from the ego-centric video.
no code implementations • 10 Oct 2024 • Xu Yao, Xiaoxu Wu, Xi Li, Huan Xu, Chenlei Li, Ping Huang, Si Li, Xiaoning Ma, Jiulong Shan
Manufacturing quality audits are pivotal for ensuring high product standards in mass production environments.
1 code implementation • 14 Sep 2024 • Zimeng Fang, Chao Liang, Xue Zhou, Shuyuan Zhu, Xi Li
Different from existing tracking-by-detection MOT methods, AED gets rid of prior knowledge (e.g., motion cues) and relies solely on highly robust feature learning to handle complex trajectories in OV-MOT tasks while keeping excellent performance in CV-MOT tasks.
Ranked #1 on Multi-Object Tracking on TAO (using extra training data)
no code implementations • 6 Sep 2024 • Weijie He, Mushui Liu, Yunlong Yu, Zheming Lu, Xi Li
Single-frame infrared small target (SIRST) detection poses a significant challenge due to the requirement to discern minute targets amidst complex infrared background clutter.
no code implementations • 23 Aug 2024 • Tao Wu, Yong Zhang, Xintao Wang, Xianpan Zhou, Guangcong Zheng, Zhongang Qi, Ying Shan, Xi Li
However, since it is only trained on static images, the fine-tuning process of subject learning disrupts abilities of video diffusion models (VDMs) to combine concepts and generate motions.
no code implementations • 22 Aug 2024 • Mushui Liu, Fangtai Wu, Bozheng Li, Ziqian Lu, Yunlong Yu, Xi Li
Few-shot learning (FSL) aims to recognize new concepts using a limited number of visual samples.
1 code implementation • 18 Jul 2024 • Lu Gan, Xi Li
Moreover, we trained the YOLOv8n model on these three datasets, and its performance on the TXL-PBC dataset surpasses that on the two original datasets.
no code implementations • 4 Jul 2024 • Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lu Jin, Xi Li
To enhance the sensitivity to the walking pattern while maintaining the robustness of recognition, we present a Complementary Learning with neural Architecture Search (CLASH) framework, consisting of walking pattern sensitive gait descriptor named dense spatial-temporal field (DSTF) and neural architecture search based complementary learning (NCL).
no code implementations • 2 Jul 2024 • Huanzhang Dou, Ruixiang Li, Wei Su, Xi Li
In text-to-video (T2V) generation, significant attention has been directed toward its development, yet unifying discrete and continuous grounding conditions in T2V generation remains under-explored.
1 code implementation • 28 Jun 2024 • Longrong Yang, Dong Shen, Chaoxiang Cai, Fan Yang, Size Li, Di Zhang, Xi Li
The Mixture-of-Experts (MoE) has gained increasing attention in studying Large Vision-Language Models (LVLMs).
no code implementations • CVPR 2024 • Wei Su, Peihan Miao, Huanzhang Dou, Xi Li
This inspires us to explore a question: can we eliminate linguistic-irrelevant redundant visual regions to improve the efficiency of the model?
1 code implementation • 15 Jun 2024 • Yike Yuan, Huanzhang Dou, Fengjun Guo, Xi Li
This paper presents a neat yet effective framework, named SemanticMIM, to integrate the advantages of masked image modeling (MIM) and contrastive learning (CL) for general visual representation.
1 code implementation • CVPR 2024 • Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, Jingdong Wang, Gaoang Wang, Xi Li
Vision-based roadside 3D object detection has attracted rising attention in the autonomous driving domain, since it has inherent advantages in reducing blind spots and expanding perception range.
no code implementations • 10 Jun 2024 • Xi Li, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang
Backdoor attacks present significant threats to Large Language Models (LLMs), particularly with the rise of third-party services that offer API integration and prompt engineering.
no code implementations • 7 Jun 2024 • Jie Deng, Wenhao Chai, Junsheng Huang, Zhonghan Zhao, Qixuan Huang, Mingyan Gao, Jianshu Guo, Shengyu Hao, Wenhao Hu, Jenq-Neng Hwang, Xi Li, Gaoang Wang
The rendered scenes lack variety, resembling the training images, resulting in monotonous styles.
1 code implementation • 26 May 2024 • Tianyi Bai, Hao Liang, Binwang Wan, Yanran Xu, Xi Li, Shiyu Li, Ling Yang, Bozhou Li, Yifan Wang, Bin Cui, Ping Huang, Jiulong Shan, Conghui He, Binhang Yuan, Wentao Zhang
Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities, including text, vision, audio, video, and 3D environments.
1 code implementation • 17 May 2024 • Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li
In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggregating and integrating global information, facilitating efficient semantic segmentation of remote sensing images.
1 code implementation • 26 Apr 2024 • Enxin Song, Wenhao Chai, Tian Ye, Jenq-Neng Hwang, Xi Li, Gaoang Wang
Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.
Ranked #4 on Question Answering on NExT-QA (Open-ended VideoQA)
3 code implementations • 12 Apr 2024 • Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu
SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improve the capability of the model to detect "unseen" attack types.
no code implementations • 15 Mar 2024 • Tao Wu, XueWei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li
Controllable spherical panoramic image generation holds substantial applicative potential across a variety of domains. However, it remains a challenging task due to the inherent spherical distortion and geometry characteristics, resulting in low-quality content generation. In this paper, we introduce a novel framework, SphereDiffusion, to address these unique challenges and generate high-quality, precisely controllable spherical panoramic images. For the spherical distortion characteristic, we embed the semantics of the distorted object with text encoding, then explicitly construct the relationship with text-object correspondence to better use the pre-trained knowledge of the planar images. Meanwhile, we employ a deformable technique to mitigate the semantic deviation in latent space caused by spherical distortion. For the spherical geometry characteristic, by virtue of spherical rotation invariance, we improve the data diversity and optimization objectives in the training process, enabling the model to better learn the spherical geometry characteristic. Furthermore, we enhance the denoising process of the diffusion model, enabling it to effectively use the learned geometric characteristic to ensure the boundary continuity of the generated images. With these specific techniques, experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation and reduces FID by around 35% on average in relative terms.
1 code implementation • 22 Feb 2024 • Jiajie Su, Chaochao Chen, Zibin Lin, Xi Li, Weiming Liu, Xiaolin Zheng
To tackle these challenges, we propose a Personalized Behavior-Aware Transformer framework (PBAT) for the MBSR problem, which models personalized patterns and multifaceted sequential collaborations in a novel way to boost recommendation performance.
no code implementations • 3 Feb 2024 • Xi Li, Hang Wang, David J. Miller, George Kesidis
A variety of defenses have been proposed against backdoor attacks on deep neural network (DNN) classifiers.
no code implementations • 2 Feb 2024 • Xi Li, Jiaqi Wang
Federated Learning (FL), while a breakthrough in decentralized machine learning, contends with significant challenges such as limited data availability and the variability of computational resources, which can stifle the performance and scalability of the models.
no code implementations • 18 Jan 2024 • Chen Wu, Xi Li, Jiaqi Wang
Federated Learning (FL) addresses critical issues in machine learning related to data privacy and security, yet suffers from data insufficiency and imbalance under certain circumstances.
no code implementations • CVPR 2024 • Jiahan Li, Jiuyang Dong, Shenjin Huang, Xi Li, Junjun Jiang, Xiaopeng Fan, Yongbing Zhang
Recently virtual staining technology has greatly promoted the advancement of histopathology.
1 code implementation • 21 Dec 2023 • Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Zhangyong Tang, Josef Kittler
Advanced image fusion methods are devoted to generating the fusion results by aggregating the complementary information conveyed by the source images.
1 code implementation • 13 Sep 2023 • Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li
Never having seen an object and heard its sound simultaneously, can the model still accurately localize its visual position from the input audio?
1 code implementation • ICCV 2023 • Longrong Yang, Xianpan Zhou, XueWei Li, Liang Qiao, Zheyang Li, Ziwei Yang, Gaoang Wang, Xi Li
Thus, the optimum of the distillation loss does not necessarily lead to the optimal student classification scores for dense object detectors.
no code implementations • 21 Aug 2023 • Xi Li, Songhe Wang, Ruiquan Huang, Mahanth Gowda, George Kesidis
Although there are extensive studies on backdoor attacks against image data, the susceptibility of video-based systems under backdoor attacks remains largely unexplored.
no code implementations • 18 Aug 2023 • Xi Li, Zhen Xiang, David J. Miller, George Kesidis
Backdoor (Trojan) attacks are an important type of adversarial exploit against deep neural networks (DNNs), wherein a test instance is (mis)classified to the attacker's target class whenever the attacker's backdoor trigger is present.
no code implementations • 25 Jul 2023 • Yiming Wu, Ruixiang Li, Zequn Qin, Xinhai Zhao, Xi Li
In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths.
1 code implementation • 6 Jun 2023 • XueWei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li
Experimental results on Stanford2D3D Panoramic datasets show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance is improved by an order of magnitude.
Ranked #4 on Semantic Segmentation on Stanford2D3D Panoramic
no code implementations • CVPR 2023 • Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Yining Lin, Xi Li
Gait is one of the most promising biometrics that aims to identify pedestrians from their walking patterns.
no code implementations • 6 Jun 2023 • Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li
Towards this goal, MetaGait injects meta-knowledge, which could guide the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve the adaptiveness from the omni-scale, omni-dimension, and omni-process perspectives.
no code implementations • 6 Jun 2023 • Yike Yuan, Xinghe Fu, Yunlong Yu, Xi Li
In this paper, we propose a simple yet effective transformer framework for self-supervised learning called DenseDINO to learn dense visual representations.
no code implementations • 6 Jun 2023 • Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lin Dong, Zequn Qin, Xi Li
In this work, we propose to solve the hard sample issue with a Memory-augmented Progressive Learning network (GaitMPL), including Dynamic Reweighting Progressive Learning module (DRPL) and Global Structure-Aligned Memory bank (GSAM).
1 code implementation • CVPR 2023 • Wei Su, Peihan Miao, Huanzhang Dou, Gaoang Wang, Liang Qiao, Zheyang Li, Xi Li
The active perception can take expressions as priors to extract relevant visual features, which can effectively alleviate the mismatches.
1 code implementation • 6 Jun 2023 • Wei Su, Peihan Miao, Huanzhang Dou, Yongjian Fu, Xi Li
Different from universal object detection, referring expression comprehension (REC) aims to locate specific objects referred to by natural language expressions.
no code implementations • 16 May 2023 • Cyril Shih-Huan Hsu, Jorge Martín-Pérez, Danny De Vleeschauwer, Luca Valcarenghi, Xi Li, Chrysa Papagianni
Cellular-Vehicle-to-Everything (C-V2X) is currently at the forefront of the digital transformation of our society.
1 code implementation • 10 May 2023 • Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Josef Kittler
We argue that there is a scope to improve the fusion performance with the help of the FusionBooster, a model specifically designed for the fusion task.
2 code implementations • CVPR 2023 • Guangcong Zheng, Xianpan Zhou, XueWei Li, Zhongang Qi, Ying Shan, Xi Li
To overcome the difficult multimodal fusion of image and layout, we propose to construct a structural image patch with region information and transform the patched image into a special layout to fuse with the normal layout in a unified form.
Ranked #1 on Layout-to-Image Generation on Visual Genome 128x128
no code implementations • 13 Feb 2023 • Shihao Su, Jianyun Xu, Huanyu Wang, Zhenwei Miao, Xin Zhan, Dayang Hao, Xi Li
Point cloud panoptic segmentation is a challenging task that seeks a holistic solution for both semantic and instance segmentation to predict groupings of coherent points.
no code implementations • CVPR 2023 • Luwen Duan, Min Wu, Lijian Mao, Jun Yin, Jianping Xiong, Xi Li
Automatic prohibited item detection in security inspection X-ray images is necessary for transportation. The abundance and diversity of X-ray security images containing prohibited items, termed prohibited X-ray security images, are essential for training the detection model.
1 code implementation • CVPR 2023 • Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, Ting Chen
In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework.
Ranked #49 on Anomaly Detection on MVTec AD
no code implementations • 20 Nov 2022 • Shanshan Zhao, Mingming Gong, Xi Li, DaCheng Tao
To explore the role of the relation between edges, this paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module, which aims to enhance the point-to-point relation through modelling the edge-to-edge interaction in the local region adaptively.
no code implementations • 28 Sep 2022 • Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li
Existing methods mainly extract the text information from only one sentence to represent an image, and the text representation strongly affects the quality of the generated image.
2 code implementations • ICCV 2023 • Zequn Qin, Jingyu Chen, Chao Chen, Xiaozhi Chen, Xi Li
Bird's eye view (BEV) representation is a new perception formulation for autonomous driving, which is based on spatial fusion.
1 code implementation • 14 Jul 2022 • Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li
In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.
1 code implementation • 23 Jun 2022 • Shengming Li, Guangcong Zheng, Hui Wang, Taiping Yao, Yang Chen, Shoudong Ding, Xi Li
Denoising Diffusion Probabilistic Model (DDPM) is able to make flexible conditional image generation from prior noise to real data, by introducing an independent noise-aware classifier to provide conditional gradient guidance at each time step of denoising process.
Ranked #2 on Conditional Image Generation on ImageNet 128x128
2 code implementations • 15 Jun 2022 • Zequn Qin, Pengyi Zhang, Xi Li
With the help of the anchor-driven representation, we then reformulate the lane detection task as an ordinal classification problem to get the coordinates of lanes.
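As a rough illustration of how a classification output over ordered location bins can yield a lane coordinate, the sketch below takes the expected bin index under a softmax and scales it to pixels. The function names, the bin layout, and the expectation-based decoding are assumptions made for illustration; the entry above does not specify how the authors decode coordinates.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_coordinate(logits, image_width):
    """Map per-bin logits for one row anchor to an x coordinate in pixels.

    Bins are assumed to be ordered left to right, so the expected bin
    index under the softmax distribution is a meaningful location estimate.
    """
    probs = softmax(logits)
    expected_bin = sum(i * p for i, p in enumerate(probs))
    # Scale the bin index range [0, num_bins - 1] to [0, image_width - 1].
    return expected_bin / (len(logits) - 1) * (image_width - 1)


# Hypothetical logits for one row anchor with 5 location bins.
x = decode_coordinate([0.0, 0.0, 10.0, 0.0, 0.0], image_width=101)
```

Because the bins are ordered, neighbouring bins contribute to a sub-bin estimate rather than being treated as unrelated classes, which is the property an ordinal formulation relies on.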
1 code implementation • CVPR 2022 • Zequn Qin, Xi Li
To alleviate this problem, we propose to introduce the ground plane as a prior in monocular 3D object detection.
no code implementations • 12 May 2022 • Xintian Wu, Qihang Zhang, Yiming Wu, Huanyu Wang, Songyuan Li, Lingyun Sun, Xi Li
Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motion.
no code implementations • 12 May 2022 • Xintian Wu, Huanyu Wang, Yiming Wu, Xi Li
To transfer knowledge between discriminators, we design a multi-level discriminant knowledge distillation from the source discriminator to the target discriminator on both the real and fake samples.
no code implementations • 1 May 2022 • Jiling Cao, Jeong-Hoon Kim, Xi Li, Wenjun Zhang
In this paper, we derive closed-form formulas of first-order approximation for down-and-out barrier and floating strike lookback put option prices under a stochastic volatility model, by using an asymptotic approach.
no code implementations • 22 Apr 2022 • Pingping Dai, Haiming Zhu, Shuang Ge, Ruihan Zhang, Xiang Qian, Xi Li, Kehong Yuan
In this paper, inspired by self-training in semi-supervised learning, we propose a novel approach that tackles the lack of annotated data from another angle, called medical image pixel rearrangement (MIPR for short).
no code implementations • 21 Apr 2022 • Peihan Miao, Wei Su, Gaoang Wang, XueWei Li, Xi Li
As an important and challenging problem in vision-language tasks, referring expression comprehension (REC) generally requires a large amount of multi-grained information of visual and linguistic modalities to realize accurate reasoning.
no code implementations • 16 Mar 2022 • Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li
In practice, new images are usually made available in a consecutive manner, leading to a problem called Continual Semantic Segmentation (CSS).
no code implementations • 18 Feb 2022 • Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Jabbar Abdul
Deep learning has been applied to a wide range of applications and has become increasingly popular in recent years.
1 code implementation • 10 Feb 2022 • Liangjun Feng, Chunhui Zhao, Xi Li
When training samples are scarce, the semantic embedding technique, i.e., describing class labels with attributes, provides a condition to generate visual features for unseen objects by transferring the knowledge from seen objects.
no code implementations • 26 Jan 2022 • Hongruixuan Chen, Edoardo Nemni, Sofia Vallecorsa, Xi Li, Chen Wu, Lars Bromley
Considering the frontier advances of Transformer architecture in the computer vision field, in this paper, we present the first attempt at designing a Transformer-based damage assessment architecture (DamFormer).
Ranked #6 on Extracting Buildings In Remote Sensing Images on xBD
no code implementations • 13 Jan 2022 • Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li
Recently, there is growing attention on one-stage panoptic segmentation methods which aim to segment instances and stuff jointly within a fully convolutional pipeline efficiently.
no code implementations • 9 Jan 2022 • Xinrong Zhang, Zihou Ren, Xi Li, Shuqi Liu, Yunlong Deng, Yadi Xiao, Yuxing Han, Jiangtao Wen
The global influential factor of the reference to the citing paper is the product of the local influential factor and the total influential factor of the citing paper.
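Written as a formula, the relationship stated above could look like the following. The symbols are assumptions chosen for illustration and do not come from the paper: $G(r \to p)$ is the global influential factor of reference $r$ to citing paper $p$, $L(r \to p)$ is the local influential factor, and $T(p)$ is the total influential factor of the citing paper.

```latex
G(r \to p) = L(r \to p) \cdot T(p)
```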
no code implementations • 6 Dec 2021 • Xi Li, Zhen Xiang, David J. Miller, George Kesidis
A DNN under attack will predict an attacker-desired target class whenever a test sample from any source class is embedded with a backdoor pattern, while correctly classifying clean (attack-free) test samples.
no code implementations • 20 Oct 2021 • Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, George Kesidis
Backdoor attacks (BA) are an emerging threat to deep neural network classifiers.
1 code implementation • 12 Oct 2021 • Yiming Wu, Xintian Wu, Xi Li, Jian Tian
As a challenging task, unsupervised person ReID aims to match images of the same identity as the query images without requiring any labeled information.
no code implementations • 6 Sep 2021 • Xi Li, George Kesidis, David J. Miller, Vladimir Lucic
We demonstrate a backdoor attack on a deep neural network used for regression.
no code implementations • 24 Aug 2021 • Lin William Cong, Xi Li, Ke Tang, Yang Yang
We introduce systematic tests exploiting robust statistical and behavioral patterns in trading to detect fake transactions on 29 cryptocurrency exchanges.
no code implementations • 28 Jul 2021 • Xi Li, George Kesidis, David J. Miller, Maxime Bergeron, Ryan Ferguson, Vladimir Lucic
We describe a gradient-based method to discover local error maximizers of a deep neural network (DNN) used for regression, assuming the availability of an "oracle" capable of providing real-valued supervision (a regression target) for samples.
no code implementations • 16 Jul 2021 • Muhammed Muzammul, Xi Li
At the end, we showed future directions with existing challenges of the field.
no code implementations • 13 Jul 2021 • Jiabao Cui, Pengyi Zhang, Songyuan Li, Liangli Zheng, Cuizhu Bao, Jupeng Xia, Xi Li
The key issue of the direct recognition is to preserve identity information of secret images into container images and make container images look similar to cover images at the same time.
no code implementations • 30 Jun 2021 • Hanbin Zhao, Xin Qin, Shihao Su, Yongjian Fu, Zibo Lin, Xi Li
With the rapid development of social media, a tremendous number of videos with new classes are generated daily, raising an urgent demand for video classification methods that can continuously learn new classes while maintaining the knowledge of old videos with limited storage and computing resources.
no code implementations • 28 Jun 2021 • Hui Wang, Hanbin Zhao, Xi Li
In this paper, we propose a novel image process scheme called class-based expansion learning for image classification, which aims at improving the supervision-stimulation frequency for the samples of the confusing classes.
1 code implementation • 6 Jun 2021 • Qianren Mao, Xi Li, Bang Liu, Shu Guo, Peng Hao, JianXin Li, Lihong Wang
These tokens or phrases may originate from primary fragmental textual pieces (e.g., segments) in the original text and are separated into different segments.
no code implementations • 30 May 2021 • Pengyi Zhang, Huanzhang Dou, Wenhu Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Xi Li
To diversify the extrinsic factors of gait, we build a complicated scene with a dense camera layout.
no code implementations • 29 May 2021 • Qianren Mao, Jiazheng Wang, Zheng Wang, Xi Li, Bo Li, JianXin Li
We meticulously analyze the corpus using well-known metrics, focusing on the style of the summaries and the complexity of the summarization task.
no code implementations • 28 May 2021 • Xi Li, David J. Miller, Zhen Xiang, George Kesidis
Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.
no code implementations • 24 May 2021 • Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Songyuan Li, Jabbar Abdul
Deep learning has been applied to a wide range of applications and has become increasingly popular in recent years.
no code implementations • 11 May 2021 • Xi Li, Meng Cao, Yingying Tang, Scott Johnston, Zhendong Hong, Huimin Ma, Jiulong Shan
Inspired by the observation that audiences have different visual preferences on foreground and background objects, we for the first time propose to use saliency masks in the evaluation processes of the task of video frame interpolation.
1 code implementation • CVPR 2021 • Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li
In principle, the feature modeling scheme is carried out in a depth-sensitive attention module, which leads to the RGB feature enhancement as well as the background distraction reduction by capturing the depth geometry prior.
no code implementations • 9 Mar 2021 • Xin Qin, Hanbin Zhao, Guangchen Lin, Hao Zeng, Songcen Xu, Xi Li
In this paper, we propose a temporal-position-sensitive context modeling approach to incorporate both positional and semantic information for more precise action localization.
no code implementations • 4 Mar 2021 • Hui Wang, Jian Tian, Songyuan Li, Hanbin Zhao, Qi Tian, Fei Wu, Xi Li
Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning.
1 code implementation • IEEE 2021 • Yingxu Qiao, Jiabao Cui, Fuxian Huang, Hongmin Liu, Cuizhu Bao, Xi Li
Photorealistic style transfer is a challenging task, which demands that the stylized image remain realistic.
no code implementations • 29 Jan 2021 • Xi Li, Zehua Wang, Victor C. M. Leung, Hong Ji, Yiming Liu, Heli Zhang
The paths leading to future networks are pointing towards a data-driven paradigm to better cater to the explosive growth of mobile services as well as the increasing heterogeneity of mobile devices, many of which generate and consume large volumes and variety of data.
no code implementations • 5 Jan 2021 • Huanzhang Dou, Wenhu Zhang, Pengyi Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Fei Wu, Lin Dong, Xi Li
With the motivation of practical gait recognition applications, we propose to automatically create a large-scale synthetic gait dataset (called VersatileGait) by a game engine, which consists of around one million silhouette sequences of 11,000 subjects with fine-grained attributes in various complicated scenarios.
1 code implementation • ICCV 2021 • Huanyu Wang, Songyuan Li, Shihao Su, Zequn Qin, Xi Li
In this paper, we model the relations for dynamic inference from two aspects: the routers and the samples.
7 code implementations • ICCV 2021 • Zequn Qin, Pengyi Zhang, Fei Wu, Xi Li
With the proof, we naturally generalize the compression of the channel attention mechanism in the frequency domain and propose our method with multi-spectral channel attention, termed as FcaNet.
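The entry above does not spell out the proof. The sketch below assumes it refers to the standard relationship that global average pooling equals the lowest-frequency coefficient of a 2D discrete cosine transform up to a constant factor, and shows how several frequency components could describe one channel instead of a single average. All function names and the choice of frequencies are hypothetical, and this is not the authors' implementation.

```python
import math


def dct_component(x, u, v):
    """Unnormalized 2D DCT-II coefficient (u, v) of a 2D list x."""
    h, w = len(x), len(x[0])
    return sum(
        x[i][j]
        * math.cos(math.pi * u * (i + 0.5) / h)
        * math.cos(math.pi * v * (j + 0.5) / w)
        for i in range(h)
        for j in range(w)
    )


def global_average_pool(x):
    """Mean of all entries in a 2D list x."""
    h, w = len(x), len(x[0])
    return sum(sum(row) for row in x) / (h * w)


def multi_spectral_descriptor(x, freqs):
    """Compress one channel into one scalar per (u, v) frequency pair."""
    return [dct_component(x, u, v) for u, v in freqs]


# Hypothetical 2x2 feature map for a single channel.
feature_map = [[1.0, 2.0], [3.0, 6.0]]
gap = global_average_pool(feature_map)
lowest = dct_component(feature_map, 0, 0)  # equals H * W * gap
descriptor = multi_spectral_descriptor(feature_map, [(0, 0), (0, 1), (1, 0)])
```

With `u = v = 0` both cosine terms are 1, so the coefficient is just the sum of the feature map, which is why average pooling keeps only the lowest frequency and the other components carry information it discards.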
no code implementations • 8 Dec 2020 • Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xi Li, Xian-Sheng Hua
To the best of our knowledge, our CADDet is the first work to introduce dynamic routing mechanism in object detection.
1 code implementation • 11 Aug 2020 • Fangfang Wang, Yifeng Chen, Fei Wu, Xi Li
Arbitrary-shaped text detection is a challenging task due to the complex geometric layouts of texts such as large aspect ratios, various scales, random rotations and curve shapes.
no code implementations • 4 Aug 2020 • Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li
To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer.
no code implementations • 24 Jul 2020 • Hanbin Zhao, Hao Zeng, Xin Qin, Yongjian Fu, Hui Wang, Bourahla Omar, Xi Li
As an important and challenging problem, multi-domain learning (MDL) typically seeks a set of effective lightweight domain-specific adapter modules plugged into a common domain-agnostic network.
no code implementations • 19 Jul 2020 • Xuandong Zhao, Jinbao Xue, Jin Yu, Xi Li, Hongxia Yang
In real-world applications, networks usually consist of billions of various types of nodes and edges with abundant attributes.
no code implementations • 13 Jul 2020 • Bin Li, Jian Tian, Zhongfei Zhang, Hailin Feng, Xi Li
Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem.
no code implementations • 28 Jun 2020 • Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li
As a challenging problem, few-shot class-incremental learning (FSCIL) continually learns a sequence of tasks, confronting the dilemma between slow forgetting of old knowledge and fast adaptation to new knowledge.
1 code implementation • 25 Jun 2020 • Jiabao Cui, XueWei Li, Bin Li, Hanbin Zhao, Bourahla Omar, Xi Li
In this paper, we propose a novel learning scheme called epoch-evolving Gaussian Process Guided Learning (GPGL), which aims at characterizing the correlation information between the batch-level distribution and the global data distribution.
no code implementations • 9 Jun 2020 • Abdul Jabbar, Xi Li, Bourahla Omar
We survey (I) the original GAN model and its modified classical versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions.
no code implementations • 8 Jun 2020 • Xuewei Li, Songyuan Li, Bourahla Omar, Fei Wu, Xi Li
In this paper, we see knowledge distillation in a fresh light, using the knowledge gap, or the residual, between a teacher and a student as guidance to train a much more lightweight student, called a res-student.
1 code implementation • 29 May 2020 • Huanyu Wang, Zequn Qin, Songyuan Li, Xi Li
In this paper, we see dynamic routing networks in a fresh light, formulating a routing method as a mapping from a sample space to a routing space.
no code implementations • 21 May 2020 • Xi Li, Huimin Ma, Hongbing Ma, Yidong Wang
In order to solve this problem, the research proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).
no code implementations • 4 May 2020 • Songyuan Li, Junyi Feng, Xi Li
Based on the feature fusion, our Context Feature Rectification (CFR) module learns the model's difference from a per-frame model to correct the warped features.
no code implementations • 27 Apr 2020 • Yongjian Fu, Xintian Wu, Xi Li, Zhijie Pan, Daxin Luo
Different from many other attributes, facial expression can change in a continuous way, and therefore, a slight semantic change of input should also lead to the output fluctuation limited in a small scale.
10 code implementations • ECCV 2020 • Zequn Qin, Huanyu Wang, Xi Li
Modern methods mainly regard lane detection as a problem of pixel-wise segmentation, which struggles with challenging scenarios and speed.
Ranked #50 on Lane Detection on CULane
no code implementations • 1 Apr 2020 • Weichao Li, Xi Li, Omar Elfarouk Bourahla, Fuxian Huang, Fei Wu, Wei Liu, Zhiheng Wang, Hongmin Liu
Visual tracking is typically solved as a discriminative learning problem that usually requires high-quality samples for online model adaptation.
no code implementations • 31 Mar 2020 • Peng Sun, Jiaxiang Wu, Songyuan Li, Peiwen Lin, Junzhou Huang, Xi Li
To satisfy the stringent requirements on computational resources in the field of real-time semantic segmentation, most approaches focus on the hand-crafted design of light-weight segmentation networks.
Neural Architecture Search Real-Time Semantic Segmentation +1
no code implementations • CVPR 2020 • Yifeng Chen, Guangchen Lin, Songyuan Li, Bourahla Omar, Yiming Wu, Fangfang Wang, Junyi Feng, Mingliang Xu, Xi Li
Panoptic segmentation aims to perform instance segmentation for foreground instances and semantic segmentation for background stuff simultaneously.
1 code implementation • 30 Mar 2020 • Junyi Feng, Songyuan Li, Xi Li, Fei Wu, Qi Tian, Ming-Hsuan Yang, Haibin Ling
Real-time semantic video segmentation is a challenging task due to the strict requirements of inference speed.
no code implementations • 10 Mar 2020 • Xi Li, Huimin Ma, Sheng Yi, Yanxian Chen
Semantic segmentation tasks under weakly supervised conditions have been put forward to achieve a lightweight labeling process.
no code implementations • 26 Nov 2019 • Sheng Yi, Xi Li, Huimin Ma
To solve this problem, we added the box regression module to the weakly supervised object detection network and proposed a proposal scoring network (PSNet) to supervise it.
Ranked #22 on Weakly Supervised Object Detection on PASCAL VOC 2007
1 code implementation • CVPR 2020 • Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi
Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with a new search space where a lightweight model can be effectively explored through cell-level diversity and a latency-oriented constraint.
1 code implementation • 5 Sep 2019 • Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou
While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.
Ranked #3 on Person Re-Identification on PRID2011
Graph Representation Learning Video-Based Person Re-Identification
no code implementations • Thirty-Third AAAI Conference on Artificial Intelligence, 2019 • Bin Li, Xi Li, Zhongfei Zhang, Fei Wu
Owing to its effective representation, skeleton-based human action recognition has received considerable research attention and has a wide range of real-world applications.
no code implementations • 24 May 2019 • Peng Sun, Peiwen Lin, Guangliang Cheng, Jianping Shi, Jiawan Zhang, Xi Li
Video object segmentation aims at accurately segmenting the target object regions across consecutive frames.
1 code implementation • 18 May 2019 • George Kesidis, Nader Alfares, Xi Li, Bhuvan Urgaonkar, Mahmut Kandemir, Takis Konstantopoulos
We consider a content-caching system that is shared by a number of proxies.
Performance Networking and Internet Architecture
no code implementations • 19 Apr 2019 • Yunze Man, Yangsibo Huang, Junyi Feng, Xi Li, Fei Wu
Segmentation of the pancreas is important for medical image analysis, yet it faces the great challenges of class imbalance, background distractions, and non-rigid geometrical features.
no code implementations • 29 Nov 2018 • Siyu Huang, Zhi-Qi Cheng, Xi Li, Xiao Wu, Zhongfei Zhang, Alexander Hauptmann
To tackle this challenge, we present a novel pipeline comprised of an Observer Engine and a Physicist Engine by respectively imitating the actions of an observer and a physicist in the real world.
no code implementations • 17 Nov 2018 • Yunze Man, Xinshuo Weng, Xi Li, Kris Kitani
We focus on estimating the 3D orientation of the ground plane from a single image.
no code implementations • 6 Oct 2018 • Yiming Wu, Wei Ji, Xi Li, Gang Wang, Jianwei Yin, Fei Wu
As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images.
1 code implementation • 22 Aug 2018 • Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann
In this work, we explore cross-scale similarity in the crowd counting scenario, in which regions of different scales often exhibit high visual similarity.
no code implementations • CVPR 2018 • Xiang Wang, ShaoDi You, Xi Li, Huimin Ma
Then in the top-down step, the refined object regions are used as supervision to train the segmentation network and to predict object masks.
no code implementations • CVPR 2018 • Fangfang Wang, Liming Zhao, Xi Li, Xinchao Wang, DaCheng Tao
Localizing text in the wild is challenging when the targets have a complicated geometric layout, such as random orientation and large aspect ratio.
no code implementations • 23 Apr 2018 • Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu
A critical and challenging problem in reinforcement learning is how to learn the state-action value function from the experience replay buffer while maintaining sample efficiency and converging faster to a high-quality solution.
no code implementations • 19 Apr 2018 • Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann
A key problem in deep multi-attribute learning is to effectively discover the inter-attribute correlation structures.
no code implementations • 7 Mar 2018 • Chaojie Mao, Yingming Li, Yaqing Zhang, Zhongfei Zhang, Xi Li
In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations.
no code implementations • 7 Mar 2018 • Chaojie Mao, Yingming Li, Zhongfei Zhang, Yaqing Zhang, Xi Li
In this work, we present a deep convolutional pyramid person matching network (PPMN) with specially designed Pyramid Matching Module to address the problem of person re-identification.
no code implementations • 2 Dec 2017 • Chongyi Li, Jichang Guo, Fatih Porikli, Chunle Guo, Huzhu Fu, Xi Li
Despite the recent progress in image dehazing, several problems remain largely unsolved such as robustness for varying scenes, the visual quality of reconstructed images, and effectiveness and flexibility for applications.
no code implementations • 2 Nov 2017 • Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li, Zhongfei Zhang
The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing.
no code implementations • 25 Jul 2017 • Lina Wei, Fangfang Wang, Xi Li, Fei Wu, Jun Xiao
As a result, a key issue in video saliency detection is how to effectively capture the intrinsic properties of atomic video structures as well as their associated contextual interactions along the spatial and temporal dimensions.
no code implementations • 25 Jul 2017 • Te Pi, Xi Li, Zhongfei Zhang
For adaptable knowledge transfer, we devise a Semantic Correlation Regularization (SCR) approach to regularize the boosted model to be consistent with the inter-class semantic correlations.
no code implementations • 24 Jul 2017 • Lina Wei, Shanshan Zhao, Omar El Farouk Bourahla, Xi Li, Fei Wu
In this paper, we propose an end-to-end group-wise deep co-saliency detection approach to address the co-salient object discovery problem based on the fully convolutional network (FCN) with group input and group output.
1 code implementation • ICCV 2017 • Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang
In this paper, we address the problem of person re-identification, which refers to associating the persons captured from different cameras.
Ranked #110 on Person Re-Identification on Market-1501
no code implementations • 23 Jul 2017 • Shanshan Zhao, Xi Li, Omar El Farouk Bourahla
Therefore, a key issue to solve in this area is how to effectively model the multi-scale correspondence structure properties in an adaptive end-to-end learning fashion.
no code implementations • 27 Mar 2017 • Yunlong Yu, Zhong Ji, Xi Li, Jichang Guo, Zhongfei Zhang, Haibin Ling, Fei Wu
As an important and challenging problem in computer vision, zero-shot learning (ZSL) aims at automatically recognizing the instances from unseen object classes without training data.
4 code implementations • 23 Nov 2016 • Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng
A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.
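The role of the identity mapping can be sketched in a few lines. The block below is a minimal illustration, not the architecture from this paper: each block returns its input plus the output of a residual branch, so when a branch contributes nothing the input passes through unchanged. The helper names (`residual_block`, `residual_network`, `zero_branch`, `scale_branch`) are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Branch = Callable[[Vector], Vector]


def residual_block(x: Vector, branch: Branch) -> Vector:
    """Return x + branch(x); the identity path bypasses the residual branch."""
    fx = branch(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def residual_network(x: Vector, branches: Sequence[Branch]) -> Vector:
    """Apply a stack of residual blocks in sequence."""
    for branch in branches:
        x = residual_block(x, branch)
    return x


def zero_branch(x: Vector) -> Vector:
    """A residual branch that contributes nothing (toy example)."""
    return [0.0 for _ in x]


def scale_branch(factor: float) -> Branch:
    """A toy residual branch that scales its input by a constant."""
    def branch(x: Vector) -> Vector:
        return [factor * xi for xi in x]
    return branch


# With all-zero branches, five stacked blocks act as the identity.
unchanged = residual_network([1.0, -2.0], [zero_branch] * 5)

# With branch(x) = x, each block doubles its input.
doubled_three_times = residual_network([1.0, 2.0], [scale_branch(1.0)] * 3)
```

Because the input always reaches the output through the identity path, a stack of such blocks can fall back to passing information straight through, which is the intuition behind the improved information flow described above.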
no code implementations • 23 May 2016 • Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, Xuehai Zhou
As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems.
no code implementations • 27 Jan 2016 • Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang
The highly effective visual representation and deep context models ensure that our framework achieves a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.
no code implementations • ICCV 2015 • Peiyi Li, Haibin Ling, Xi Li, Chunyuan Liao
In this paper, we propose a real-time 3D hand pose estimation algorithm using the randomized decision forest framework.
no code implementations • 19 Oct 2015 • Xi Li, Liming Zhao, Lina Wei, Ming-Hsuan Yang, Fei Wu, Yueting Zhuang, Haibin Ling, Jingdong Wang
A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner.
no code implementations • 21 Jul 2015 • Xi Li, Chunhua Shen, Anthony Dick, Zhongfei Zhang, Yueting Zhuang
Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame.
no code implementations • 4 Dec 2014 • Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang
As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.
no code implementations • 4 Jan 2014 • Xi Li, Weiming Hu, Chunhua Shen, Anthony Dick, Zhongfei Zhang
Using both CAHSM and DHPC, a robust spectral clustering algorithm is developed.
no code implementations • 22 Oct 2013 • Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van Den Hengel
In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions.
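For readers unfamiliar with the data structure, a hypergraph can be stored as a vertex-by-hyperedge incidence matrix. The sketch below is a generic illustration under stated assumptions: vertices stand for image regions and each hyperedge is a set of region indices that share some context. How the paper actually constructs its hyperedges is not shown here, and the function names are hypothetical.

```python
from typing import List, Set


def incidence_matrix(num_vertices: int, hyperedges: List[Set[int]]) -> List[List[int]]:
    """Build H where H[v][e] = 1 if vertex v belongs to hyperedge e, else 0."""
    return [[1 if v in edge else 0 for edge in hyperedges]
            for v in range(num_vertices)]


def vertex_degrees(H: List[List[int]]) -> List[int]:
    """Number of hyperedges each vertex participates in."""
    return [sum(row) for row in H]


def edge_degrees(H: List[List[int]]) -> List[int]:
    """Number of vertices contained in each hyperedge."""
    return [sum(col) for col in zip(*H)]


# Toy example: four regions and two hyperedges grouping regions by context.
toy_hyperedges = [{0, 1, 2}, {2, 3}]
H = incidence_matrix(4, toy_hyperedges)
```

Unlike an ordinary graph edge, a hyperedge can connect any number of vertices, which is what allows it to describe the shared context of a whole group of pixels or regions at once.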
no code implementations • CVPR 2013 • Xi Li, Chunhua Shen, Anthony Dick, Anton Van Den Hengel
A key problem in visual tracking is to represent the appearance of an object in a way that is robust to visual changes.
no code implementations • CVPR 2013 • Chunfeng Yuan, Xi Li, Weiming Hu, Haibin Ling, Stephen Maybank
In this paper, we propose a new global feature to capture the detailed geometrical distribution of interest points.