Search Results for author: Xi Li

Found 123 papers, 34 papers with code

Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

2 code implementations 12 Apr 2024 Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu

SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improves the capability of the model to detect "unseen" attack types.

Data Augmentation Face Anti-Spoofing +1

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

no code implementations 15 Mar 2024 Tao Wu, XueWei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li

Controllable spherical panoramic image generation holds substantial applicative potential across a variety of domains. However, it remains a challenging task due to the inherent spherical distortion and geometry characteristics, resulting in low-quality content generation. In this paper, we introduce a novel framework, SphereDiffusion, to address these unique challenges and better generate high-quality and precisely controllable spherical panoramic images. For the spherical distortion characteristic, we embed the semantics of the distorted object with text encoding, then explicitly construct the relationship with text-object correspondence to better use the pre-trained knowledge of the planar images. Meanwhile, we employ a deformable technique to mitigate the semantic deviation in latent space caused by spherical distortion. For the spherical geometry characteristic, by virtue of spherical rotation invariance, we improve the data diversity and optimization objectives in the training process, enabling the model to better learn the spherical geometry characteristic. Furthermore, we enhance the denoising process of the diffusion model, enabling it to effectively use the learned geometric characteristic to ensure the boundary continuity of the generated images. With these specific techniques, experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation and reduces FID by around 35% on average in relative terms.

Denoising Image Generation

Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

1 code implementation 22 Feb 2024 Jiajie Su, Chaochao Chen, Zibin Lin, Xi Li, Weiming Liu, Xiaolin Zheng

To tackle these challenges, we propose a Personalized Behavior-Aware Transformer framework (PBAT) for the MBSR problem, which models personalized patterns and multifaceted sequential collaborations in a novel way to boost recommendation performance.

Sequential Recommendation

Universal Post-Training Reverse-Engineering Defense Against Backdoors in Deep Neural Networks

no code implementations 3 Feb 2024 Xi Li, Hang Wang, David J. Miller, George Kesidis

A variety of defenses have been proposed against backdoor attacks on deep neural network (DNN) classifiers.

Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models

no code implementations 2 Feb 2024 Xi Li, Jiaqi Wang

Federated Learning (FL), while a breakthrough in decentralized machine learning, contends with significant challenges such as limited data availability and the variability of computational resources, which can stifle the performance and scalability of the models.

Data Augmentation Fairness +2

Vulnerabilities of Foundation Model Integrated Federated Learning Under Adversarial Threats

no code implementations 18 Jan 2024 Chen Wu, Xi Li, Jiaqi Wang

Federated Learning (FL) addresses critical issues in machine learning related to data privacy and security, yet suffers from data insufficiency and imbalance under certain circumstances.

Federated Learning

TextFusion: Unveiling the Power of Textual Semantics for Controllable Image Fusion

1 code implementation 21 Dec 2023 Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Zhangyong Tang, Josef Kittler

Advanced image fusion methods are devoted to generating the fusion results by aggregating the complementary information conveyed by the source images.

Image Quality Assessment Language Modelling

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

1 code implementation 13 Sep 2023 Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li

Never having seen an object and heard its sound simultaneously, can the model still accurately localize its visual position from the input audio?

CoLA Visual Localization

Temporal-Distributed Backdoor Attack Against Video Based Action Recognition

no code implementations 21 Aug 2023 Xi Li, Songhe Wang, Ruiquan Huang, Mahanth Gowda, George Kesidis

Although there are extensive studies on backdoor attacks against image data, the susceptibility of video-based systems under backdoor attacks remains largely unexplored.

Action Recognition Backdoor Attack +3

Backdoor Mitigation by Correcting the Distribution of Neural Activations

no code implementations 18 Aug 2023 Xi Li, Zhen Xiang, David J. Miller, George Kesidis

Backdoor (Trojan) attacks are an important type of adversarial exploit against deep neural networks (DNNs), wherein a test instance is (mis)classified to the attacker's target class whenever the attacker's backdoor trigger is present.

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

no code implementations 25 Jul 2023 Yiming Wu, Ruixiang Li, Zequn Qin, Xinhai Zhao, Xi Li

In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths.

3D Object Detection Autonomous Driving +1

Referring Expression Comprehension Using Language Adaptive Inference

1 code implementation 6 Jun 2023 Wei Su, Peihan Miao, Huanzhang Dou, Yongjian Fu, Xi Li

Different from universal object detection, referring expression comprehension (REC) aims to locate specific objects referred to by natural language expressions.

object-detection Object Detection +2

DenseDINO: Boosting Dense Self-Supervised Learning with Token-Based Point-Level Consistency

no code implementations 6 Jun 2023 Yike Yuan, Xinghe Fu, Yunlong Yu, Xi Li

In this paper, we propose a simple yet effective transformer framework for self-supervised learning called DenseDINO to learn dense visual representations.

Position Segmentation +2

SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

1 code implementation 6 Jun 2023 XueWei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li

Experimental results on Stanford2D3D Panoramic datasets show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance is improved by an order of magnitude.

Semantic Segmentation

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition

no code implementations 6 Jun 2023 Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li

Towards this goal, MetaGait injects meta-knowledge, which could guide the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve the adaptiveness from the omni-scale, omni-dimension, and omni-process perspectives.

Gait Recognition

GaitMPL: Gait Recognition with Memory-Augmented Progressive Learning

no code implementations 6 Jun 2023 Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lin Dong, Zequn Qin, Xi Li

In this work, we propose to solve the hard sample issue with a Memory-augmented Progressive Learning network (GaitMPL), including Dynamic Reweighting Progressive Learning module (DRPL) and Global Structure-Aligned Memory bank (GSAM).

Gait Recognition

A Deep RL Approach on Task Placement and Scaling of Edge Resources for Cellular Vehicle-to-Network Service Provisioning

no code implementations 16 May 2023 Cyril Shih-Huan Hsu, Jorge Martín-Pérez, Danny De Vleeschauwer, Koteswararao Kondepu, Luca Valcarenghi, Xi Li, Chrysa Papagianni

By conducting a complexity analysis, we prove that DDPG-based solutions achieve runtimes in the range of sub-milliseconds, meeting the strict latency requirements of C-V2N services.

Decision Making

FusionBooster: A Unified Image Fusion Boosting Paradigm

1 code implementation 10 May 2023 Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Josef Kittler

We argue that there is scope to improve the fusion performance with the help of the FusionBooster, a model specifically designed for the fusion task.

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

2 code implementations CVPR 2023 Guangcong Zheng, Xianpan Zhou, XueWei Li, Zhongang Qi, Ying Shan, Xi Li

To overcome the difficult multimodal fusion of image and layout, we propose to construct a structural image patch with region information and transform the patched image into a special layout to fuse with the normal layout in a unified form.

Layout-to-Image Generation Object

PUPS: Point Cloud Unified Panoptic Segmentation

no code implementations 13 Feb 2023 Shihao Su, Jianyun Xu, Huanyu Wang, Zhenwei Miao, Xin Zhan, Dayang Hao, Xi Li

Point cloud panoptic segmentation is a challenging task that seeks a holistic solution for both semantic and instance segmentation to predict groupings of coherent points.

Instance Segmentation Panoptic Segmentation +1

RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-Ray Security Image Synthesis

no code implementations CVPR 2023 Luwen Duan, Min Wu, Lijian Mao, Jun Yin, Jianping Xiong, Xi Li

Automatic prohibited item detection in security inspection X-ray images is necessary for transportation. The abundance and diversity of X-ray security images containing prohibited items, termed prohibited X-ray security images, are essential for training the detection model.

Image Generation

DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection

1 code implementation CVPR 2023 Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, Ting Chen

In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework.

Denoising One-Class Classification +1

Adaptive Edge-to-Edge Interaction Learning for Point Cloud Analysis

no code implementations 20 Nov 2022 Shanshan Zhao, Mingming Gong, Xi Li, DaCheng Tao

To explore the role of the relation between edges, this paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module, which aims to enhance the point-to-point relation through modelling the edge-to-edge interaction in the local region adaptively.

Relation Semantic Segmentation

Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation

no code implementations 28 Sep 2022 Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li

Existing methods mainly extract the text information from only one sentence to represent an image, and the text representation strongly affects the quality of the generated image.

Attribute Sentence +1

UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View

2 code implementations ICCV 2023 Zequn Qin, Jingyu Chen, Chao Chen, Xiaozhi Chen, Xi Li

Bird's eye view (BEV) representation is a new perception formulation for autonomous driving, which is based on spatial fusion.

Autonomous Driving

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

1 code implementation 14 Jul 2022 Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li

In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.

Knowledge Distillation Optical Character Recognition (OCR) +1

Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation

1 code implementation 23 Jun 2022 Shengming Li, Guangcong Zheng, Hui Wang, Taiping Yao, Yang Chen, Shoudong Ding, Xi Li

Denoising Diffusion Probabilistic Model (DDPM) is able to make flexible conditional image generation from prior noise to real data, by introducing an independent noise-aware classifier to provide conditional gradient guidance at each time step of denoising process.

Conditional Image Generation Denoising

MonoGround: Detecting Monocular 3D Objects from the Ground

1 code implementation CVPR 2022 Zequn Qin, Xi Li

To alleviate this problem, we propose to introduce the ground plane as a prior in monocular 3D object detection.

Depth Estimation Monocular 3D Object Detection +2

Ultra Fast Deep Lane Detection with Hybrid Anchor Driven Ordinal Classification

2 code implementations 15 Jun 2022 Zequn Qin, Pengyi Zhang, Xi Li

With the help of the anchor-driven representation, we then reformulate the lane detection task as an ordinal classification problem to get the coordinates of lanes.

Lane Detection Ordinal Classification

D3T-GAN: Data-Dependent Domain Transfer GANs for Few-shot Image Generation

no code implementations 12 May 2022 Xintian Wu, Huanyu Wang, Yiming Wu, Xi Li

To transfer knowledge between discriminators, we design a multi-level discriminant knowledge distillation from the source discriminator to the target discriminator on both the real and fake samples.

Image Generation Knowledge Distillation +1

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks

no code implementations 12 May 2022 Xintian Wu, Qihang Zhang, Yiming Wu, Huanyu Wang, Songyuan Li, Lingyun Sun, Xi Li

Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motions.

Pricing Path-dependent Options under Stochastic Volatility via Mellin Transform

no code implementations 1 May 2022 Jiling Cao, Jeong-Hoon Kim, Xi Li, Wenjun Zhang

In this paper, we derive closed-form formulas of first-order approximation for down-and-out barrier and floating strike lookback put option prices under a stochastic volatility model, by using an asymptotic approach.

MIPR: Automatic Annotation of Medical Images with Pixel Rearrangement

no code implementations 22 Apr 2022 Pingping Dai, Haiming Zhu, Shuang Ge, Ruihan Zhang, Xiang Qian, Xi Li, Kehong Yuan

In this paper, inspired by self-training in semi-supervised learning, we propose a novel approach that addresses the lack of annotated data from another angle, called medical image pixel rearrangement (MIPR for short).

Pseudo Label Segmentation +1

Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension

no code implementations 21 Apr 2022 Peihan Miao, Wei Su, Gaoang Wang, XueWei Li, Xi Li

As an important and challenging problem in vision-language tasks, referring expression comprehension (REC) generally requires a large amount of multi-grained information of visual and linguistic modalities to realize accurate reasoning.

Informativeness Referring Expression +1

RBC: Rectifying the Biased Context in Continual Semantic Segmentation

no code implementations 16 Mar 2022 Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li

In practice, new images are usually made available in a consecutive manner, leading to a problem called Continual Semantic Segmentation (CSS).

Continual Semantic Segmentation Segmentation +1

A Review on Methods and Applications in Multimodal Deep Learning

no code implementations 18 Feb 2022 Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Jabbar Abdul

Deep Learning has been applied in a wide range of applications and has become increasingly popular in recent years.

Multimodal Deep Learning

Bias-Eliminated Semantic Refinement for Any-Shot Learning

1 code implementation 10 Feb 2022 Liangjun Feng, Chunhui Zhao, Xi Li

When training samples are scarce, the semantic embedding technique, i.e., describing class labels with attributes, provides a condition to generate visual features for unseen objects by transferring the knowledge from seen objects.

Few-Shot Learning Generalized Zero-Shot Learning +1

Dual-Tasks Siamese Transformer Framework for Building Damage Assessment

no code implementations 26 Jan 2022 Hongruixuan Chen, Edoardo Nemni, Sofia Vallecorsa, Xi Li, Chen Wu, Lars Bromley

Considering the frontier advances of Transformer architecture in the computer vision field, in this paper, we present the first attempt at designing a Transformer-based damage assessment architecture (DamFormer).

Disaster Response Extracting Buildings In Remote Sensing Images +1

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

no code implementations 13 Jan 2022 Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li

Recently, there is growing attention on one-stage panoptic segmentation methods which aim to segment instances and stuff jointly within a fully convolutional pipeline efficiently.

Instance Segmentation Panoptic Segmentation +1

Phocus: Picking Valuable Research from a Sea of Citations

no code implementations 9 Jan 2022 Xinrong Zhang, Zihou Ren, Xi Li, Shuqi Liu, Yunlong Deng, Yadi Xiao, Yuxing Han, Jiangtao Wen

The global influential factor of the reference to the citing paper is the product of the local influential factor and the total influential factor of the citing paper.

Sentence

Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks

no code implementations 6 Dec 2021 Xi Li, Zhen Xiang, David J. Miller, George Kesidis

A DNN being attacked will predict to an attacker-desired target class whenever a test sample from any source class is embedded with a backdoor pattern; while correctly classifying clean (attack-free) test samples.

Backdoor Attack Image Classification

MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification

1 code implementation 12 Oct 2021 Yiming Wu, Xintian Wu, Xi Li, Jian Tian

As a challenging task, unsupervised person ReID aims to match query images to the same identity without requiring any labeled information.

Unsupervised Person Re-Identification

Crypto Wash Trading

no code implementations 24 Aug 2021 Lin William Cong, Xi Li, Ke Tang, Yang Yang

We introduce systematic tests exploiting robust statistical and behavioral patterns in trading to detect fake transactions on 29 cryptocurrency exchanges.

Robust and Active Learning for Deep Neural Network Regression

no code implementations 28 Jul 2021 Xi Li, George Kesidis, David J. Miller, Maxime Bergeron, Ryan Ferguson, Vladimir Lucic

We describe a gradient-based method to discover local error maximizers of a deep neural network (DNN) used for regression, assuming the availability of an "oracle" capable of providing real-valued supervision (a regression target) for samples.

Active Learning regression

Multitask Identity-Aware Image Steganography via Minimax Optimization

no code implementations 13 Jul 2021 Jiabao Cui, Pengyi Zhang, Songyuan Li, Liangli Zheng, Cuizhu Bao, Jupeng Xia, Xi Li

The key issue of the direct recognition is to preserve identity information of secret images into container images and make container images look similar to cover images at the same time.

Image Restoration Image Steganography

When Video Classification Meets Incremental Classes

no code implementations 30 Jun 2021 Hanbin Zhao, Xin Qin, Shihao Su, Yongjian Fu, Zibo Lin, Xi Li

With the rapid development of social media, a tremendous number of videos with new classes are generated daily, which raises an urgent demand for video classification methods that can continuously update new classes while maintaining the knowledge of old videos with limited storage and computing resources.

Classification Class Incremental Learning +3

Progressive Class-based Expansion Learning For Image Classification

no code implementations 28 Jun 2021 Hui Wang, Hanbin Zhao, Xi Li

In this paper, we propose a novel image process scheme called class-based expansion learning for image classification, which aims at improving the supervision-stimulation frequency for the samples of the confusing classes.

Classification Image Classification

Attend and select: A segment selective transformer for microblog hashtag generation

1 code implementation 6 Jun 2021 Qianren Mao, Xi Li, Bang Liu, Shu Guo, Peng Hao, JianXin Li, Lihong Wang

These tokens or phrases may originate from primary fragmental textual pieces (e.g., segments) in the original text and are separated into different segments.

CNTLS: A Benchmark Dataset for Abstractive or Extractive Chinese Timeline Summarization

no code implementations 29 May 2021 Qianren Mao, Jiazheng Wang, Zheng Wang, Xi Li, Bo Li, JianXin Li

We meticulously analyze the corpus using well-known metrics, focusing on the style of the summaries and the complexity of the summarization task.

Information Retrieval Retrieval +3

A BIC-based Mixture Model Defense against Data Poisoning Attacks on Classifiers

no code implementations 28 May 2021 Xi Li, David J. Miller, Zhen Xiang, George Kesidis

Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.

Data Poisoning

Recent Advances and Trends in Multimodal Deep Learning: A Review

no code implementations 24 May 2021 Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Songyuan Li, Jabbar Abdul

Deep Learning has been applied in a wide range of applications and has become increasingly popular in recent years.

Multimodal Deep Learning

Video Frame Interpolation via Structure-Motion based Iterative Fusion

no code implementations 11 May 2021 Xi Li, Meng Cao, Yingying Tang, Scott Johnston, Zhendong Hong, Huimin Ma, Jiulong Shan

Inspired by the observation that audiences have different visual preferences on foreground and background objects, we for the first time propose to use saliency masks in the evaluation processes of the task of video frame interpolation.

Optical Flow Estimation Video Frame Interpolation

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

1 code implementation CVPR 2021 Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li

In principle, the feature modeling scheme is carried out in a depth-sensitive attention module, which leads to the RGB feature enhancement as well as the background distraction reduction by capturing the depth geometry prior.

object-detection RGB-D Salient Object Detection +2

PcmNet: Position-Sensitive Context Modeling Network for Temporal Action Localization

no code implementations 9 Mar 2021 Xin Qin, Hanbin Zhao, Guangchen Lin, Hao Zeng, Songcen Xu, Xi Li

In this paper, we propose a temporal-position-sensitive context modeling approach to incorporate both positional and semantic information for more precise action localization.

Boundary Detection Position +3

Unsupervised Domain Adaptation for Image Classification via Structure-Conditioned Adversarial Learning

no code implementations 4 Mar 2021 Hui Wang, Jian Tian, Songyuan Li, Hanbin Zhao, Qi Tian, Fei Wu, Xi Li

Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning.

General Classification Image Classification +2

Blockchain-empowered Data-driven Networks: A Survey and Outlook

no code implementations 29 Jan 2021 Xi Li, Zehua Wang, Victor C. M. Leung, Hong Ji, Yiming Liu, Heli Zhang

The paths leading to future networks are pointing towards a data-driven paradigm to better cater to the explosive growth of mobile services as well as the increasing heterogeneity of mobile devices, many of which generate and consume large volumes and variety of data.

Networking and Internet Architecture

VersatileGait: A Large-Scale Synthetic Gait Dataset with Fine-Grained Attributes and Complicated Scenarios

no code implementations 5 Jan 2021 Huanzhang Dou, Wenhu Zhang, Pengyi Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Fei Wu, Lin Dong, Xi Li

With the motivation of practical gait recognition applications, we propose to automatically create a large-scale synthetic gait dataset (called VersatileGait) by a game engine, which consists of around one million silhouette sequences of 11,000 subjects with fine-grained attributes in various complicated scenarios.

Gait Recognition

RDI-Net: Relational Dynamic Inference Networks

1 code implementation ICCV 2021 Huanyu Wang, Songyuan Li, Shihao Su, Zequn Qin, Xi Li

In this paper, we model the relations for dynamic inference from two aspects: the routers and the samples.

Computational Efficiency Relation

FcaNet: Frequency Channel Attention Networks

7 code implementations ICCV 2021 Zequn Qin, Pengyi Zhang, Fei Wu, Xi Li

With the proof, we naturally generalize the compression of the channel attention mechanism in the frequency domain and propose our method with multi-spectral channel attention, termed as FcaNet.

Image Classification Instance Segmentation +3

Learning to Generate Content-Aware Dynamic Detectors

no code implementations 8 Dec 2020 Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xi Li, Xian-Sheng Hua

To the best of our knowledge, our CADDet is the first work to introduce a dynamic routing mechanism in object detection.

Computational Efficiency Object +2

TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection

1 code implementation 11 Aug 2020 Fangfang Wang, Yifeng Chen, Fei Wu, Xi Li

Arbitrary-shaped text detection is a challenging task due to the complex geometric layouts of texts such as large aspect ratios, various scales, random rotations and curve shapes.

Scene Text Detection Text Detection

Memory Efficient Class-Incremental Learning for Image Classification

no code implementations 4 Aug 2020 Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li

To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer.

Classification Class Incremental Learning +4

What and Where: Learn to Plug Adapters via NAS for Multi-Domain Learning

no code implementations 24 Jul 2020 Hanbin Zhao, Hao Zeng, Xin Qin, Yongjian Fu, Hui Wang, Bourahla Omar, Xi Li

As an important and challenging problem, multi-domain learning (MDL) typically seeks a set of effective lightweight domain-specific adapter modules plugged into a common domain-agnostic network.

Neural Architecture Search

A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning

no code implementations 19 Jul 2020 Xuandong Zhao, Jinbao Xue, Jin Yu, Xi Li, Hongxia Yang

In real-world applications, networks usually consist of billions of various types of nodes and edges with abundant attributes.

Link Prediction Network Embedding

Multitask Non-Autoregressive Model for Human Motion Prediction

no code implementations 13 Jul 2020 Bin Li, Jian Tian, Zhongfei Zhang, Hailin Feng, Xi Li

Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem.

Action Recognition Human motion prediction +2

MgSvF: Multi-Grained Slow vs. Fast Framework for Few-Shot Class-Incremental Learning

no code implementations 28 Jun 2020 Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li

As a challenging problem, few-shot class-incremental learning (FSCIL) continually learns a sequence of tasks, confronting the dilemma between slow forgetting of old knowledge and fast adaptation to new knowledge.

Few-Shot Class-Incremental Learning Incremental Learning

Epoch-evolving Gaussian Process Guided Learning

1 code implementation 25 Jun 2020 Jiabao Cui, XueWei Li, Bin Li, Hanbin Zhao, Bourahla Omar, Xi Li

In this paper, we propose a novel learning scheme called epoch-evolving Gaussian Process Guided Learning (GPGL), which aims at characterizing the correlation information between the batch-level distribution and the global data distribution.

A Survey on Generative Adversarial Networks: Variants, Applications, and Training

no code implementations 9 Jun 2020 Abdul Jabbar, Xi Li, Bourahla Omar

We provide (I) a survey of the original GAN model and its modified classical versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions.

ResKD: Residual-Guided Knowledge Distillation

no code implementations 8 Jun 2020 Xuewei Li, Songyuan Li, Bourahla Omar, Fei Wu, Xi Li

In this paper, we see knowledge distillation in a fresh light, using the knowledge gap, or the residual, between a teacher and a student as guidance to train a much more lightweight student, called a res-student.

Knowledge Distillation

CoDiNet: Path Distribution Modeling with Consistency and Diversity for Dynamic Routing

1 code implementation 29 May 2020 Huanyu Wang, Zequn Qin, Songyuan Li, Xi Li

In this paper, we see dynamic routing networks in a fresh light, formulating a routing method as a mapping from a sample space to a routing space.

Model Compression

Unsupervised segmentation via semantic-apparent feature fusion

no code implementations 21 May 2020 Xi Li, Huimin Ma, Hongbing Ma, Yidong Wang

In order to solve this problem, the research proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).

Foreground Segmentation Segmentation

Tamed Warping Network for High-Resolution Semantic Video Segmentation

no code implementations 4 May 2020 Songyuan Li, Junyi Feng, Xi Li

Based on the feature fusion, our Context Feature Rectification (CFR) module learns the model's difference from a per-frame model to correct the warped features.

Motion Estimation Real-Time Semantic Segmentation +2

Semantic Neighborhood-Aware Deep Facial Expression Recognition

no code implementations 27 Apr 2020 Yongjian Fu, Xintian Wu, Xi Li, Zhijie Pan, Daxin Luo

Different from many other attributes, facial expression can change in a continuous way, and therefore, a slight semantic change of input should also lead to the output fluctuation limited in a small scale.

Facial Expression Recognition Facial Expression Recognition (FER)

Ultra Fast Structure-aware Deep Lane Detection

8 code implementations ECCV 2020 Zequn Qin, Huanyu Wang, Xi Li

Modern methods mainly regard lane detection as a problem of pixel-wise segmentation, which struggles with challenging scenarios and speed.

Lane Detection

Progressive Multi-Stage Learning for Discriminative Tracking

no code implementations 1 Apr 2020 Weichao Li, Xi Li, Omar Elfarouk Bourahla, Fuxian Huang, Fei Wu, Wei Liu, Zhiheng Wang, Hongmin Liu

Visual tracking is typically solved as a discriminative learning problem that usually requires high-quality samples for online model adaptation.

Visual Tracking

Real-Time Semantic Segmentation via Auto Depth, Downsampling Joint Decision and Feature Aggregation

no code implementations 31 Mar 2020 Peng Sun, Jiaxiang Wu, Songyuan Li, Peiwen Lin, Junzhou Huang, Xi Li

To satisfy the stringent requirements on computational resources in the field of real-time semantic segmentation, most approaches focus on the hand-crafted design of light-weight segmentation networks.

Neural Architecture Search Real-Time Semantic Segmentation +1

WSOD with PSNet and Box Regression

no code implementations 26 Nov 2019 Sheng Yi, Xi Li, Huimin Ma

To solve this problem, we added the box regression module to the weakly supervised object detection network and proposed a proposal scoring network (PSNet) to supervise it.

Object object-detection +3

Graph-guided Architecture Search for Real-time Semantic Segmentation

1 code implementation CVPR 2020 Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi

Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with new search space where a lightweight model can be effectively explored through the cell-level diversity and latency-oriented constraint.

Real-Time Semantic Segmentation

Adaptive Graph Representation Learning for Video Person Re-identification

1 code implementation5 Sep 2019 Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou

While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.

Graph Representation Learning Video-Based Person Re-Identification

OVSNet: Towards One-Pass Real-Time Video Object Segmentation

no code implementations24 May 2019 Peng Sun, Peiwen Lin, Guangliang Cheng, Jianping Shi, Jiawan Zhang, Xi Li

Video object segmentation aims at accurately segmenting the target object regions across consecutive frames.

Object object-detection +6

On a caching system with object sharing

1 code implementation18 May 2019 George Kesidis, Nader Alfares, Xi Li, Bhuvan Urgaonkar, Mahmut Kandemir, Takis Konstantopoulos

We consider a content-caching system that is shared by a number of proxies.

Performance Networking and Internet Architecture

Deep Q Learning Driven CT Pancreas Segmentation with Geometry-Aware U-Net

no code implementations19 Apr 2019 Yunze Man, Yangsibo Huang, Junyi Feng, Xi Li, Fei Wu

Segmentation of pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions and non-rigid geometrical features.

Pancreas Segmentation Q-Learning +1

Perceiving Physical Equation by Observing Visual Scenarios

no code implementations29 Nov 2018 Siyu Huang, Zhi-Qi Cheng, Xi Li, Xiao Wu, Zhongfei Zhang, Alexander Hauptmann

To tackle this challenge, we present a novel pipeline comprised of an Observer Engine and a Physicist Engine by respectively imitating the actions of an observer and a physicist in the real world.

Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images

no code implementations6 Oct 2018 Yiming Wu, Wei Ji, Xi Li, Gang Wang, Jianwei Yin, Fei Wu

As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images.

Hand Pose Estimation

Stacked Pooling: Improving Crowd Counting by Boosting Scale Invariance

1 code implementation22 Aug 2018 Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann

In this work, we explore the cross-scale similarity in crowd counting scenario, in which the regions of different scales often exhibit high visual similarity.

Crowd Counting Density Estimation

Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features

no code implementations CVPR 2018 Xiang Wang, ShaoDi You, Xi Li, Huimin Ma

Then in the top-down step, the refined object regions are used as supervision to train the segmentation network and to predict object masks.

General Classification Object +3

Geometry-Aware Scene Text Detection With Instance Transformation Network

no code implementations CVPR 2018 Fangfang Wang, Liming Zhao, Xi Li, Xinchao Wang, DaCheng Tao

Localizing text in the wild is challenging in the situations of complicated geometric layout of the targets like random orientation and large aspect ratio.

General Classification Multi-Task Learning +5

State Distribution-aware Sampling for Deep Q-learning

no code implementations23 Apr 2018 Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu

A critical and challenging problem in reinforcement learning is how to learn the state-action value function from the experience replay buffer and simultaneously keep sample efficiency and faster convergence to a high quality solution.

Atari Games OpenAI Gym +1

GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning

no code implementations19 Apr 2018 Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann

A key problem in deep multi-attribute learning is to effectively discover the inter-attribute correlation structures.

Attribute Neural Architecture Search

Multi-Channel Pyramid Person Matching Network for Person Re-Identification

no code implementations7 Mar 2018 Chaojie Mao, Yingming Li, Yaqing Zhang, Zhongfei Zhang, Xi Li

In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations.

Person Re-Identification

Pyramid Person Matching Network for Person Re-identification

no code implementations7 Mar 2018 Chaojie Mao, Yingming Li, Zhongfei Zhang, Yaqing Zhang, Xi Li

In this work, we present a deep convolutional pyramid person matching network (PPMN) with specially designed Pyramid Matching Module to address the problem of person re-identification.

Person Re-Identification

DR-Net: Transmission Steered Single Image Dehazing Network with Weakly Supervised Refinement

no code implementations2 Dec 2017 Chongyi Li, Jichang Guo, Fatih Porikli, Chunle Guo, Huzhu Fu, Xi Li

Despite the recent progress in image dehazing, several problems remain largely unsolved such as robustness for varying scenes, the visual quality of reconstructed images, and effectiveness and flexibility for applications.

Image Dehazing Single Image Dehazing +1

Deep Air Learning: Interpolation, Prediction, and Feature Analysis of Fine-grained Air Quality

no code implementations2 Nov 2017 Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li, Zhongfei Zhang

The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing.

feature selection

Boosted Zero-Shot Learning with Semantic Correlation Regularization

no code implementations25 Jul 2017 Te Pi, Xi Li, Zhongfei Zhang

For adaptable knowledge transfer, we devise a Semantic Correlation Regularization (SCR) approach to regularize the boosted model to be consistent with the inter-class semantic correlations.

Model Selection Transfer Learning +1

Graph-Theoretic Spatiotemporal Context Modeling for Video Saliency Detection

no code implementations25 Jul 2017 Lina Wei, Fangfang Wang, Xi Li, Fei Wu, Jun Xiao

As a result, a key issue in video saliency detection is how to effectively capture the intrinsic properties of atomic video structures as well as their associated contextual interactions along the spatial and temporal dimensions.

Video Saliency Detection

Group-wise Deep Co-saliency Detection

no code implementations24 Jul 2017 Lina Wei, Shanshan Zhao, Omar El Farouk Bourahla, Xi Li, Fei Wu

In this paper, we propose an end-to-end group-wise deep co-saliency detection approach to address the co-salient object discovery problem based on the fully convolutional network (FCN) with group input and group output.

Co-Salient Object Detection Object Discovery +1

Deep Optical Flow Estimation Via Multi-Scale Correspondence Structure Learning

no code implementations23 Jul 2017 Shanshan Zhao, Xi Li, Omar El Farouk Bourahla

Therefore, a key issue to solve in this area is how to effectively model the multi-scale correspondence structure properties in an adaptive end-to-end learning fashion.

Optical Flow Estimation

Deeply-Learned Part-Aligned Representations for Person Re-Identification

1 code implementation ICCV 2017 Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang

In this paper, we address the problem of person re-identification, which refers to associating the persons captured from different cameras.

Person Re-Identification

Transductive Zero-Shot Learning with a Self-training dictionary approach

no code implementations27 Mar 2017 Yunlong Yu, Zhong Ji, Xi Li, Jichang Guo, Zhongfei Zhang, Haibin Ling, Fei Wu

As an important and challenging problem in computer vision, zero-shot learning (ZSL) aims at automatically recognizing the instances from unseen object classes without training data.

Transductive Learning Transfer Learning +1

Deep Convolutional Neural Networks with Merge-and-Run Mappings

4 code implementations23 Nov 2016 Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng

A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

no code implementations23 May 2016 Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, Xuehai Zhou

As the emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems.

Deep Learning Driven Visual Path Prediction from a Single Image

no code implementations27 Jan 2016 Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang

The highly effective visual representation and deep context models ensure that our framework makes a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.

3D Hand Pose Estimation Using Randomized Decision Forest With Segmentation Index Points

no code implementations ICCV 2015 Peiyi Li, Haibin Ling, Xi Li, Chunyuan Liao

In this paper, we propose a real-time 3D hand pose estimation algorithm using the randomized decision forest framework.

3D Hand Pose Estimation

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

no code implementations19 Oct 2015 Xi Li, Liming Zhao, Lina Wei, Ming-Hsuan Yang, Fei Wu, Yueting Zhuang, Haibin Ling, Jingdong Wang

A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner.

Image Segmentation Multi-Task Learning +6

Online Metric-Weighted Linear Representations for Robust Visual Tracking

no code implementations21 Jul 2015 Xi Li, Chunhua Shen, Anthony Dick, Zhongfei Zhang, Yueting Zhuang

Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame.

Metric Learning Object +2

Metric Learning Driven Multi-Task Structured Output Optimization for Robust Keypoint Tracking

no code implementations4 Dec 2014 Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang

As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.

Metric Learning Object Tracking

Contextual Hypergraph Modelling for Salient Object Detection

no code implementations22 Oct 2013 Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van Den Hengel

In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions.

Object object-detection +2

Learning Compact Binary Codes for Visual Tracking

no code implementations CVPR 2013 Xi Li, Chunhua Shen, Anthony Dick, Anton Van Den Hengel

A key problem in visual tracking is to represent the appearance of an object in a way that is robust to visual changes.

Visual Tracking
