Search Results for author: Xi Li

Found 123 papers, 34 papers with code

Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

2 code implementations • 12 Apr 2024 • Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu

SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improves the capability of the model to detect "unseen" attack types.

Data Augmentation Face Anti-Spoofing +1

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

no code implementations • 15 Mar 2024 • Tao Wu, XueWei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li

Controllable spherical panoramic image generation holds substantial potential for application across a variety of domains. However, it remains a challenging task due to the inherent spherical distortion and geometry characteristics, which result in low-quality content generation. In this paper, we introduce a novel framework, SphereDiffusion, to address these unique challenges and better generate high-quality, precisely controllable spherical panoramic images. For the spherical distortion characteristic, we embed the semantics of the distorted object with text encoding, then explicitly construct the relationship with text-object correspondence to better use the pre-trained knowledge of the planar images. Meanwhile, we employ a deformable technique to mitigate the semantic deviation in latent space caused by spherical distortion. For the spherical geometry characteristic, by virtue of spherical rotation invariance, we improve the data diversity and optimization objectives in the training process, enabling the model to better learn the spherical geometry characteristic. Furthermore, we enhance the denoising process of the diffusion model, enabling it to effectively use the learned geometric characteristic to ensure the boundary continuity of the generated images. With these specific techniques, experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation and reduces FID by around 35% (relative) on average.

Denoising Image Generation

Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

1 code implementation • 22 Feb 2024 • Jiajie Su, Chaochao Chen, Zibin Lin, Xi Li, Weiming Liu, Xiaolin Zheng

To tackle these challenges, we propose a Personalized Behavior-Aware Transformer framework (PBAT) for the multi-behavior sequential recommendation (MBSR) problem, which models personalized patterns and multifaceted sequential collaborations in a novel way to boost recommendation performance.

Sequential Recommendation

Universal Post-Training Reverse-Engineering Defense Against Backdoors in Deep Neural Networks

no code implementations • 3 Feb 2024 • Xi Li, Hang Wang, David J. Miller, George Kesidis

A variety of defenses have been proposed against backdoor attacks on deep neural network (DNN) classifiers.

Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models

no code implementations • 2 Feb 2024 • Xi Li, Jiaqi Wang

Federated Learning (FL), while a breakthrough in decentralized machine learning, contends with significant challenges such as limited data availability and the variability of computational resources, which can stifle the performance and scalability of the models.

Data Augmentation Fairness +2

Vulnerabilities of Foundation Model Integrated Federated Learning Under Adversarial Threats

no code implementations • 18 Jan 2024 • Chen Wu, Xi Li, Jiaqi Wang

Federated Learning (FL) addresses critical issues in machine learning related to data privacy and security, yet suffers from data insufficiency and imbalance under certain circumstances.

Federated Learning

TextFusion: Unveiling the Power of Textual Semantics for Controllable Image Fusion

1 code implementation • 21 Dec 2023 • Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Zhangyong Tang, Josef Kittler

Advanced image fusion methods are devoted to generating the fusion results by aggregating the complementary information conveyed by the source images.

Image Quality Assessment Language Modelling

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

1 code implementation • 13 Sep 2023 • Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li

Never having seen an object and heard its sound simultaneously, can the model still accurately localize its visual position from the input audio?

CoLA Visual Localization

Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection

1 code implementation • ICCV 2023 • Longrong Yang, Xianpan Zhou, XueWei Li, Liang Qiao, Zheyang Li, Ziwei Yang, Gaoang Wang, Xi Li

Thus, the optimum of the distillation loss does not necessarily lead to the optimal student classification scores for dense object detectors.

Binary Classification Classification +4

Temporal-Distributed Backdoor Attack Against Video Based Action Recognition

no code implementations • 21 Aug 2023 • Xi Li, Songhe Wang, Ruiquan Huang, Mahanth Gowda, George Kesidis

Although there are extensive studies on backdoor attacks against image data, the susceptibility of video-based systems under backdoor attacks remains largely unexplored.

Action Recognition Backdoor Attack +3

Backdoor Mitigation by Correcting the Distribution of Neural Activations

no code implementations • 18 Aug 2023 • Xi Li, Zhen Xiang, David J. Miller, George Kesidis

Backdoor (Trojan) attacks are an important type of adversarial exploit against deep neural networks (DNNs), wherein a test instance is (mis)classified to the attacker's target class whenever the attacker's backdoor trigger is present.

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

no code implementations • 25 Jul 2023 • Yiming Wu, Ruixiang Li, Zequn Qin, Xinhai Zhao, Xi Li

In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths.

3D Object Detection Autonomous Driving +1

Referring Expression Comprehension Using Language Adaptive Inference

1 code implementation • 6 Jun 2023 • Wei Su, Peihan Miao, Huanzhang Dou, Yongjian Fu, Xi Li

Different from universal object detection, referring expression comprehension (REC) aims to locate specific objects referred to by natural language expressions.

object-detection Object Detection +2

DenseDINO: Boosting Dense Self-Supervised Learning with Token-Based Point-Level Consistency

no code implementations • 6 Jun 2023 • Yike Yuan, Xinghe Fu, Yunlong Yu, Xi Li

In this paper, we propose a simple yet effective transformer framework for self-supervised learning called DenseDINO to learn dense visual representations.

Position Segmentation +2

GaitGCI: Generative Counterfactual Intervention for Gait Recognition

no code implementations • CVPR 2023 • Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Yining Lin, Xi Li

Gait is one of the most promising biometrics that aims to identify pedestrians from their walking patterns.

counterfactual Gait Recognition

SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

1 code implementation • 6 Jun 2023 • XueWei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li

Experimental results on Stanford2D3D Panoramic datasets show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance is improved by an order of magnitude.

Ranked #4 on Semantic Segmentation on Stanford2D3D Panoramic

Semantic Segmentation

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition

no code implementations • 6 Jun 2023 • Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li

Towards this goal, MetaGait injects meta-knowledge, which could guide the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve the adaptiveness from the omni-scale, omni-dimension, and omni-process perspectives.

Gait Recognition

GaitMPL: Gait Recognition with Memory-Augmented Progressive Learning

no code implementations • 6 Jun 2023 • Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lin Dong, Zequn Qin, Xi Li

In this work, we propose to solve the hard sample issue with a Memory-augmented Progressive Learning network (GaitMPL), which includes a Dynamic Reweighting Progressive Learning module (DRPL) and a Global Structure-Aligned Memory bank (GSAM).

Gait Recognition

Language Adaptive Weight Generation for Multi-task Visual Grounding

1 code implementation • CVPR 2023 • Wei Su, Peihan Miao, Huanzhang Dou, Gaoang Wang, Liang Qiao, Zheyang Li, Xi Li

The active perception can take expressions as priors to extract relevant visual features, which can effectively alleviate the mismatches.

Referring Expression Referring Expression Comprehension +1

A Deep RL Approach on Task Placement and Scaling of Edge Resources for Cellular Vehicle-to-Network Service Provisioning

no code implementations • 16 May 2023 • Cyril Shih-Huan Hsu, Jorge Martín-Pérez, Danny De Vleeschauwer, Koteswararao Kondepu, Luca Valcarenghi, Xi Li, Chrysa Papagianni

By conducting a complexity analysis, we prove that DDPG-based solutions achieve runtimes in the range of sub-milliseconds, meeting the strict latency requirements of C-V2N services.

Decision Making

FusionBooster: A Unified Image Fusion Boosting Paradigm

1 code implementation • 10 May 2023 • Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Hui Li, Xi Li, Josef Kittler

We argue that there is scope to improve the fusion performance with the help of the FusionBooster, a model specifically designed for the fusion task.

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

2 code implementations • CVPR 2023 • Guangcong Zheng, Xianpan Zhou, XueWei Li, Zhongang Qi, Ying Shan, Xi Li

To overcome the difficult multimodal fusion of image and layout, we propose to construct a structural image patch with region information and transform the patched image into a special layout to fuse with the normal layout in a unified form.

Ranked #1 on Layout-to-Image Generation on Visual Genome 128x128

Layout-to-Image Generation Object

PUPS: Point Cloud Unified Panoptic Segmentation

no code implementations • 13 Feb 2023 • Shihao Su, Jianyun Xu, Huanyu Wang, Zhenwei Miao, Xin Zhan, Dayang Hao, Xi Li

Point cloud panoptic segmentation is a challenging task that seeks a holistic solution for both semantic and instance segmentation to predict groupings of coherent points.

Instance Segmentation Panoptic Segmentation +1

RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-Ray Security Image Synthesis

no code implementations • CVPR 2023 • Luwen Duan, Min Wu, Lijian Mao, Jun Yin, Jianping Xiong, Xi Li

Automatic prohibited item detection in security inspection X-ray images is necessary for transportation. The abundance and diversity of X-ray security images containing prohibited items, termed prohibited X-ray security images, are essential for training the detection model.

Image Generation

DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection

1 code implementation • CVPR 2023 • Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, Ting Chen

In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework.

Ranked #35 on Anomaly Detection on MVTec AD

Denoising One-Class Classification +1

Adaptive Edge-to-Edge Interaction Learning for Point Cloud Analysis

no code implementations • 20 Nov 2022 • Shanshan Zhao, Mingming Gong, Xi Li, DaCheng Tao

To explore the role of the relation between edges, this paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module, which aims to enhance the point-to-point relation through modelling the edge-to-edge interaction in the local region adaptively.

Relation Semantic Segmentation

Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation

no code implementations • 28 Sep 2022 • Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li

Existing methods mainly extract the text information from only one sentence to represent an image, and the text representation strongly affects the quality of the generated image.

Attribute Sentence +1

UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View

2 code implementations • ICCV 2023 • Zequn Qin, Jingyu Chen, Chao Chen, Xiaozhi Chen, Xi Li

Bird's eye view (BEV) representation is a new perception formulation for autonomous driving, which is based on spatial fusion.

Autonomous Driving

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

1 code implementation • 14 Jul 2022 • Ying Chen, Liang Qiao, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Xi Li

In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.

Knowledge Distillation Optical Character Recognition (OCR) +1

Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation

1 code implementation • 23 Jun 2022 • Shengming Li, Guangcong Zheng, Hui Wang, Taiping Yao, Yang Chen, Shoudong Ding, Xi Li

The Denoising Diffusion Probabilistic Model (DDPM) enables flexible conditional image generation from prior noise to real data by introducing an independent noise-aware classifier that provides conditional gradient guidance at each time step of the denoising process.

Ranked #1 on Conditional Image Generation on ImageNet 256x256

Conditional Image Generation Denoising

MonoGround: Detecting Monocular 3D Objects from the Ground

1 code implementation • CVPR 2022 • Zequn Qin, Xi Li

To alleviate this problem, we propose to introduce the ground plane as a prior in monocular 3D object detection.

Depth Estimation Monocular 3D Object Detection +2

Ultra Fast Deep Lane Detection with Hybrid Anchor Driven Ordinal Classification

2 code implementations • 15 Jun 2022 • Zequn Qin, Pengyi Zhang, Xi Li

With the help of the anchor-driven representation, we then reformulate the lane detection task as an ordinal classification problem to get the coordinates of lanes.

Lane Detection Ordinal Classification

D3T-GAN: Data-Dependent Domain Transfer GANs for Few-shot Image Generation

no code implementations • 12 May 2022 • Xintian Wu, Huanyu Wang, Yiming Wu, Xi Li

To transfer knowledge between discriminators, we design a multi-level discriminant knowledge distillation from the source discriminator to the target discriminator on both the real and fake samples.

Image Generation Knowledge Distillation +1

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks

no code implementations • 12 May 2022 • Xintian Wu, Qihang Zhang, Yiming Wu, Huanyu Wang, Songyuan Li, Lingyun Sun, Xi Li

Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motion.

Pricing Path-dependent Options under Stochastic Volatility via Mellin Transform

no code implementations • 1 May 2022 • Jiling Cao, Jeong-Hoon Kim, Xi Li, Wenjun Zhang

In this paper, we derive closed-form formulas of first-order approximation for down-and-out barrier and floating strike lookback put option prices under a stochastic volatility model, by using an asymptotic approach.

MIPR: Automatic Annotation of Medical Images with Pixel Rearrangement

no code implementations • 22 Apr 2022 • Pingping Dai, Haiming Zhu, Shuang Ge, Ruihan Zhang, Xiang Qian, Xi Li, Kehong Yuan

In this paper, inspired by self-training in semi-supervised learning, we propose a novel approach that tackles the lack of annotated data from another angle, called medical image pixel rearrangement (MIPR for short).

Pseudo Label Segmentation +1

Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension

no code implementations • 21 Apr 2022 • Peihan Miao, Wei Su, Gaoang Wang, XueWei Li, Xi Li

As an important and challenging problem in vision-language tasks, referring expression comprehension (REC) generally requires a large amount of multi-grained information of visual and linguistic modalities to realize accurate reasoning.

Informativeness Referring Expression +1

RBC: Rectifying the Biased Context in Continual Semantic Segmentation

no code implementations • 16 Mar 2022 • Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li

In practice, new images are usually made available in a consecutive manner, leading to a problem called Continual Semantic Segmentation (CSS).

Continual Semantic Segmentation Segmentation +1

A Review on Methods and Applications in Multimodal Deep Learning

no code implementations • 18 Feb 2022 • Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Jabbar Abdul

Deep learning has been applied to a wide range of applications and has become increasingly popular in recent years.

Multimodal Deep Learning

Bias-Eliminated Semantic Refinement for Any-Shot Learning

1 code implementation • 10 Feb 2022 • Liangjun Feng, Chunhui Zhao, Xi Li

When training samples are scarce, the semantic embedding technique, i.e., describing class labels with attributes, provides a condition to generate visual features for unseen objects by transferring the knowledge from seen objects.

Few-Shot Learning Generalized Zero-Shot Learning +1

Dual-Tasks Siamese Transformer Framework for Building Damage Assessment

no code implementations • 26 Jan 2022 • Hongruixuan Chen, Edoardo Nemni, Sofia Vallecorsa, Xi Li, Chen Wu, Lars Bromley

Considering the frontier advances of Transformer architecture in the computer vision field, in this paper, we present the first attempt at designing a Transformer-based damage assessment architecture (DamFormer).

Ranked #6 on Extracting Buildings In Remote Sensing Images on xBD

Disaster Response Extracting Buildings In Remote Sensing Images +1

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

no code implementations • 13 Jan 2022 • Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li

Recently, there has been growing attention on one-stage panoptic segmentation methods, which aim to efficiently segment instances and stuff jointly within a fully convolutional pipeline.

Instance Segmentation Panoptic Segmentation +1

Phocus: Picking Valuable Research from a Sea of Citations

no code implementations • 9 Jan 2022 • Xinrong Zhang, Zihou Ren, Xi Li, Shuqi Liu, Yunlong Deng, Yadi Xiao, Yuxing Han, Jiangtao Wen

The global influential factor of the reference to the citing paper is the product of the local influential factor and the total influential factor of the citing paper.

Sentence

Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks

no code implementations • 6 Dec 2021 • Xi Li, Zhen Xiang, David J. Miller, George Kesidis

A DNN under attack will predict an attacker-desired target class whenever a test sample from any source class is embedded with a backdoor pattern, while still correctly classifying clean (attack-free) test samples.

Backdoor Attack Image Classification

Detecting Backdoor Attacks Against Point Cloud Classifiers

no code implementations • 20 Oct 2021 • Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, George Kesidis

Backdoor attacks (BA) are an emerging threat to deep neural network classifiers.

Autonomous Driving

MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification

1 code implementation • 12 Oct 2021 • Yiming Wu, Xintian Wu, Xi Li, Jian Tian

As a challenging task, unsupervised person ReID aims to match query images of the same identity without requiring any labeled information.

Unsupervised Person Re-Identification

Backdoor Attack and Defense for Deep Regression

no code implementations • 6 Sep 2021 • Xi Li, George Kesidis, David J. Miller, Vladimir Lucic

We demonstrate a backdoor attack on a deep neural network used for regression.

Backdoor Attack backdoor defense +2

Crypto Wash Trading

no code implementations • 24 Aug 2021 • Lin William Cong, Xi Li, Ke Tang, Yang Yang

We introduce systematic tests exploiting robust statistical and behavioral patterns in trading to detect fake transactions on 29 cryptocurrency exchanges.

Robust and Active Learning for Deep Neural Network Regression

no code implementations • 28 Jul 2021 • Xi Li, George Kesidis, David J. Miller, Maxime Bergeron, Ryan Ferguson, Vladimir Lucic

We describe a gradient-based method to discover local error maximizers of a deep neural network (DNN) used for regression, assuming the availability of an "oracle" capable of providing real-valued supervision (a regression target) for samples.

Active Learning regression

A Survey on Deep Domain Adaptation and Tiny Object Detection Challenges, Techniques and Datasets

no code implementations • 16 Jul 2021 • Muhammed Muzammul, Xi Li

Finally, we outline future directions and the existing challenges of the field.

Data Augmentation Domain Adaptation +3

Multitask Identity-Aware Image Steganography via Minimax Optimization

no code implementations • 13 Jul 2021 • Jiabao Cui, Pengyi Zhang, Songyuan Li, Liangli Zheng, Cuizhu Bao, Jupeng Xia, Xi Li

The key issue of the direct recognition is to preserve identity information of secret images into container images and make container images look similar to cover images at the same time.

Image Restoration Image Steganography

When Video Classification Meets Incremental Classes

no code implementations • 30 Jun 2021 • Hanbin Zhao, Xin Qin, Shihao Su, Yongjian Fu, Zibo Lin, Xi Li

With the rapid development of social media, a tremendous number of videos with new classes are generated daily, which raises an urgent demand for video classification methods that can continuously learn new classes while maintaining the knowledge of old videos with limited storage and computing resources.

Classification Class Incremental Learning +3

Progressive Class-based Expansion Learning For Image Classification

no code implementations • 28 Jun 2021 • Hui Wang, Hanbin Zhao, Xi Li

In this paper, we propose a novel image process scheme called class-based expansion learning for image classification, which aims at improving the supervision-stimulation frequency for the samples of the confusing classes.

Classification Image Classification

Attend and select: A segment selective transformer for microblog hashtag generation

1 code implementation • 6 Jun 2021 • Qianren Mao, Xi Li, Bang Liu, Shu Guo, Peng Hao, JianXin Li, Lihong Wang

These tokens or phrases may originate from primary fragmental textual pieces (e.g., segments) in the original text and are separated into different segments.

VersatileGait: A Large-Scale Synthetic Gait Dataset Towards in-the-Wild Simulation

no code implementations • 30 May 2021 • Pengyi Zhang, Huanzhang Dou, Wenhu Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Xi Li

To diversify the extrinsic factors of gait, we build a complicated scene with a dense camera layout.

Gait Recognition in the Wild

CNTLS: A Benchmark Dataset for Abstractive or Extractive Chinese Timeline Summarization

no code implementations • 29 May 2021 • Qianren Mao, Jiazheng Wang, Zheng Wang, Xi Li, Bo Li, JianXin Li

We meticulously analyze the corpus using well-known metrics, focusing on the style of the summaries and the complexity of the summarization task.

Information Retrieval Retrieval +3

A BIC-based Mixture Model Defense against Data Poisoning Attacks on Classifiers

no code implementations • 28 May 2021 • Xi Li, David J. Miller, Zhen Xiang, George Kesidis

Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.

Data Poisoning

Recent Advances and Trends in Multimodal Deep Learning: A Review

no code implementations • 24 May 2021 • Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Songyuan Li, Jabbar Abdul

Deep learning has been applied to a wide range of applications and has become increasingly popular in recent years.

Multimodal Deep Learning

Video Frame Interpolation via Structure-Motion based Iterative Fusion

no code implementations • 11 May 2021 • Xi Li, Meng Cao, Yingying Tang, Scott Johnston, Zhendong Hong, Huimin Ma, Jiulong Shan

Inspired by the observation that audiences have different visual preferences for foreground and background objects, we propose, for the first time, to use saliency masks in the evaluation process of video frame interpolation.

Optical Flow Estimation Video Frame Interpolation

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

1 code implementation • CVPR 2021 • Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li

In principle, the feature modeling scheme is carried out in a depth-sensitive attention module, which leads to the RGB feature enhancement as well as the background distraction reduction by capturing the depth geometry prior.

object-detection RGB-D Salient Object Detection +2

PcmNet: Position-Sensitive Context Modeling Network for Temporal Action Localization

no code implementations • 9 Mar 2021 • Xin Qin, Hanbin Zhao, Guangchen Lin, Hao Zeng, Songcen Xu, Xi Li

In this paper, we propose a temporal-position-sensitive context modeling approach to incorporate both positional and semantic information for more precise action localization.

Boundary Detection Position +3

Unsupervised Domain Adaptation for Image Classification via Structure-Conditioned Adversarial Learning

no code implementations • 4 Mar 2021 • Hui Wang, Jian Tian, Songyuan Li, Hanbin Zhao, Qi Tian, Fei Wu, Xi Li

Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning.

General Classification Image Classification +2

Efficient Style-Corpus Constrained Learning for Photorealistic Style Transfer

1 code implementation • IEEE 2021 • Yingxu Qiao, Jiabao Cui, Fuxian Huang, Hongmin Liu, Cuizhu Bao, Xi Li

Photorealistic style transfer is a challenging task, which demands that the stylized image remain realistic.

Style Transfer

Blockchain-empowered Data-driven Networks: A Survey and Outlook

no code implementations • 29 Jan 2021 • Xi Li, Zehua Wang, Victor C. M. Leung, Hong Ji, Yiming Liu, Heli Zhang

The paths leading to future networks are pointing towards a data-driven paradigm to better cater to the explosive growth of mobile services as well as the increasing heterogeneity of mobile devices, many of which generate and consume large volumes and variety of data.

Networking and Internet Architecture

VersatileGait: A Large-Scale Synthetic Gait Dataset with Fine-Grained Attributes and Complicated Scenarios

no code implementations • 5 Jan 2021 • Huanzhang Dou, Wenhu Zhang, Pengyi Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Fei Wu, Lin Dong, Xi Li

Motivated by practical gait recognition applications, we propose to automatically create a large-scale synthetic gait dataset (called VersatileGait) with a game engine, which consists of around one million silhouette sequences of 11,000 subjects with fine-grained attributes in various complicated scenarios.

Gait Recognition

RDI-Net: Relational Dynamic Inference Networks

1 code implementation • ICCV 2021 • Huanyu Wang, Songyuan Li, Shihao Su, Zequn Qin, Xi Li

In this paper, we model the relations for dynamic inference from two aspects: the routers and the samples.

Computational Efficiency Relation

FcaNet: Frequency Channel Attention Networks

7 code implementations • ICCV 2021 • Zequn Qin, Pengyi Zhang, Fei Wu, Xi Li

With the proof, we naturally generalize the compression of the channel attention mechanism in the frequency domain and propose our method with multi-spectral channel attention, termed as FcaNet.

Image Classification Instance Segmentation +3

Learning to Generate Content-Aware Dynamic Detectors

no code implementations • 8 Dec 2020 • Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xi Li, Xian-Sheng Hua

To the best of our knowledge, our CADDet is the first work to introduce a dynamic routing mechanism in object detection.

Computational Efficiency Object +2

TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection

1 code implementation • 11 Aug 2020 • Fangfang Wang, Yifeng Chen, Fei Wu, Xi Li

Arbitrary-shaped text detection is a challenging task due to the complex geometric layouts of texts such as large aspect ratios, various scales, random rotations and curve shapes.

Scene Text Detection Text Detection

Memory Efficient Class-Incremental Learning for Image Classification

no code implementations • 4 Aug 2020 • Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li

To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer.

Classification Class Incremental Learning +4

What and Where: Learn to Plug Adapters via NAS for Multi-Domain Learning

no code implementations • 24 Jul 2020 • Hanbin Zhao, Hao Zeng, Xin Qin, Yongjian Fu, Hui Wang, Bourahla Omar, Xi Li

As an important and challenging problem, multi-domain learning (MDL) typically seeks a set of effective lightweight domain-specific adapter modules plugged into a common domain-agnostic network.

Neural Architecture Search

A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning

no code implementations • 19 Jul 2020 • Xuandong Zhao, Jinbao Xue, Jin Yu, Xi Li, Hongxia Yang

In real-world applications, networks usually consist of billions of various types of nodes and edges with abundant attributes.

Link Prediction Network Embedding

Multitask Non-Autoregressive Model for Human Motion Prediction

no code implementations • 13 Jul 2020 • Bin Li, Jian Tian, Zhongfei Zhang, Hailin Feng, Xi Li

Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem.

Action Recognition Human motion prediction +2

MgSvF: Multi-Grained Slow vs. Fast Framework for Few-Shot Class-Incremental Learning

no code implementations • 28 Jun 2020 • Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li

As a challenging problem, few-shot class-incremental learning (FSCIL) continually learns a sequence of tasks, confronting the dilemma between slow forgetting of old knowledge and fast adaptation to new knowledge.

Few-Shot Class-Incremental Learning Incremental Learning

Epoch-evolving Gaussian Process Guided Learning

1 code implementation • 25 Jun 2020 • Jiabao Cui, XueWei Li, Bin Li, Hanbin Zhao, Bourahla Omar, Xi Li

In this paper, we propose a novel learning scheme called epoch-evolving Gaussian Process Guided Learning (GPGL), which aims at characterizing the correlation information between the batch-level distribution and the global data distribution.

A Survey on Generative Adversarial Networks: Variants, Applications, and Training

no code implementations • 9 Jun 2020 • Abdul Jabbar, Xi Li, Bourahla Omar

We survey (I) the original GAN model and its modified classical versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions.

ResKD: Residual-Guided Knowledge Distillation

no code implementations • 8 Jun 2020 • Xuewei Li, Songyuan Li, Bourahla Omar, Fei Wu, Xi Li

In this paper, we see knowledge distillation in a fresh light, using the knowledge gap, or the residual, between a teacher and a student as guidance to train a much more lightweight student, called a res-student.

Knowledge Distillation

CoDiNet: Path Distribution Modeling with Consistency and Diversity for Dynamic Routing

1 code implementation • 29 May 2020 • Huanyu Wang, Zequn Qin, Songyuan Li, Xi Li

In this paper, we see dynamic routing networks in a fresh light, formulating a routing method as a mapping from a sample space to a routing space.

Model Compression

Unsupervised segmentation via semantic-apparent feature fusion

no code implementations • 21 May 2020 • Xi Li, Huimin Ma, Hongbing Ma, Yidong Wang

In order to solve this problem, the research proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).

Foreground Segmentation Segmentation

Tamed Warping Network for High-Resolution Semantic Video Segmentation

no code implementations • 4 May 2020 • Songyuan Li, Junyi Feng, Xi Li

Based on the feature fusion, our Context Feature Rectification (CFR) module learns the model's difference from a per-frame model to correct the warped features.

Motion Estimation Real-Time Semantic Segmentation +2

Semantic Neighborhood-Aware Deep Facial Expression Recognition

no code implementations • 27 Apr 2020 • Yongjian Fu, Xintian Wu, Xi Li, Zhijie Pan, Daxin Luo

Different from many other attributes, facial expression can change in a continuous way, and therefore, a slight semantic change of input should also lead to the output fluctuation limited in a small scale.

Facial Expression Recognition Facial Expression Recognition (FER)

Ultra Fast Structure-aware Deep Lane Detection

8 code implementations • ECCV 2020 • Zequn Qin, Huanyu Wang, Xi Li

Modern methods mainly regard lane detection as a problem of pixel-wise segmentation, which struggles to address challenging scenarios and speed.

Ranked #48 on Lane Detection on CULane

Lane Detection

1,718

Progressive Multi-Stage Learning for Discriminative Tracking

no code implementations • 1 Apr 2020 • Weichao Li, Xi Li, Omar Elfarouk Bourahla, Fuxian Huang, Fei Wu, Wei Liu, Zhiheng Wang, Hongmin Liu

Visual tracking is typically solved as a discriminative learning problem that usually requires high-quality samples for online model adaptation.

Visual Tracking

Real-Time Semantic Segmentation via Auto Depth, Downsampling Joint Decision and Feature Aggregation

no code implementations • 31 Mar 2020 • Peng Sun, Jiaxiang Wu, Songyuan Li, Peiwen Lin, Junzhou Huang, Xi Li

To satisfy the stringent requirements on computational resources in the field of real-time semantic segmentation, most approaches focus on the hand-crafted design of light-weight segmentation networks.

Neural Architecture Search Real-Time Semantic Segmentation +1

BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation

no code implementations • CVPR 2020 • Yifeng Chen, Guangchen Lin, Songyuan Li, Bourahla Omar, Yiming Wu, Fangfang Wang, Junyi Feng, Mingliang Xu, Xi Li

Panoptic segmentation aims to perform instance segmentation for foreground instances and semantic segmentation for background stuff simultaneously.

Instance Segmentation Occlusion Handling +2

TapLab: A Fast Framework for Semantic Video Segmentation Tapping into Compressed-Domain Knowledge

1 code implementation • 30 Mar 2020 • Junyi Feng, Songyuan Li, Xi Li, Fei Wu, Qi Tian, Ming-Hsuan Yang, Haibin Ling

Real-time semantic video segmentation is a challenging task due to the strict requirements of inference speed.

Image Segmentation Semantic Segmentation +2

Realizing Pixel-Level Semantic Learning in Complex Driving Scenes based on Only One Annotated Pixel per Class

no code implementations • 10 Mar 2020 • Xi Li, Huimin Ma, Sheng Yi, Yanxian Chen

Semantic segmentation tasks based on weakly supervised condition have been put forward to achieve a lightweight labeling process.

Segmentation Weakly supervised Semantic Segmentation +1

WSOD with PSNet and Box Regression

no code implementations • 26 Nov 2019 • Sheng Yi, Xi Li, Huimin Ma

To solve this problem, we added the box regression module to the weakly supervised object detection network and proposed a proposal scoring network (PSNet) to supervise it.

Ranked #16 on Weakly Supervised Object Detection on PASCAL VOC 2007

Object object-detection +3

Graph-guided Architecture Search for Real-time Semantic Segmentation

1 code implementation • CVPR 2020 • Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi

Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with a new search space where a lightweight model can be effectively explored through the cell-level diversity and latency-oriented constraint.

Real-Time Semantic Segmentation

Adaptive Graph Representation Learning for Video Person Re-identification

1 code implementation • 5 Sep 2019 • Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou

While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.

Ranked #3 on Person Re-Identification on PRID2011

Graph Representation Learning Video-Based Person Re-Identification

Spatiotemporal graph routing for skeleton-based action recognition

no code implementations • Thirty-Third AAAI Conference on Artificial Intelligence, 2019 • Bin Li, Xi Li, Zhongfei Zhang, Fei Wu

With the representation effectiveness, skeleton-based human action recognition has received considerable research attention, and has a wide range of real applications.

Ranked #25 on Skeleton Based Action Recognition on Kinetics-Skeleton dataset

Action Recognition Clustering +2

OVSNet: Towards One-Pass Real-Time Video Object Segmentation

no code implementations • 24 May 2019 • Peng Sun, Peiwen Lin, Guangliang Cheng, Jianping Shi, Jiawan Zhang, Xi Li

Video object segmentation aims at accurately segmenting the target object regions across consecutive frames.

Object object-detection +6

On a caching system with object sharing

1 code implementation • 18 May 2019 • George Kesidis, Nader Alfares, Xi Li, Bhuvan Urgaonkar, Mahmut Kandemir, Takis Konstantopoulos

We consider a content-caching system that is shared by a number of proxies.

Performance Networking and Internet Architecture

Deep Q Learning Driven CT Pancreas Segmentation with Geometry-Aware U-Net

no code implementations • 19 Apr 2019 • Yunze Man, Yangsibo Huang, Junyi Feng, Xi Li, Fei Wu

Segmentation of pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions and non-rigid geometrical features.

Pancreas Segmentation Q-Learning +1

Perceiving Physical Equation by Observing Visual Scenarios

no code implementations • 29 Nov 2018 • Siyu Huang, Zhi-Qi Cheng, Xi Li, Xiao Wu, Zhongfei Zhang, Alexander Hauptmann

To tackle this challenge, we present a novel pipeline comprised of an Observer Engine and a Physicist Engine by respectively imitating the actions of an observer and a physicist in the real world.

GroundNet: Monocular Ground Plane Normal Estimation with Geometric Consistency

no code implementations • 17 Nov 2018 • Yunze Man, Xinshuo Weng, Xi Li, Kris Kitani

We focus on estimating the 3D orientation of the ground plane from a single image.

Line Detection Segmentation +1

Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images

no code implementations • 6 Oct 2018 • Yiming Wu, Wei Ji, Xi Li, Gang Wang, Jianwei Yin, Fei Wu

As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images.

Hand Pose Estimation

Stacked Pooling: Improving Crowd Counting by Boosting Scale Invariance

1 code implementation • 22 Aug 2018 • Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann

In this work, we explore the cross-scale similarity in crowd counting scenario, in which the regions of different scales often exhibit high visual similarity.

Crowd Counting Density Estimation

Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features

no code implementations • CVPR 2018 • Xiang Wang, ShaoDi You, Xi Li, Huimin Ma

Then in the top-down step, the refined object regions are used as supervision to train the segmentation network and to predict object masks.

General Classification Object +3

Geometry-Aware Scene Text Detection With Instance Transformation Network

no code implementations • CVPR 2018 • Fangfang Wang, Liming Zhao, Xi Li, Xinchao Wang, DaCheng Tao

Localizing text in the wild is challenging in the situations of complicated geometric layout of the targets like random orientation and large aspect ratio.

General Classification Multi-Task Learning +5

State Distribution-aware Sampling for Deep Q-learning

no code implementations • 23 Apr 2018 • Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu

A critical and challenging problem in reinforcement learning is how to learn the state-action value function from the experience replay buffer and simultaneously keep sample efficiency and faster convergence to a high quality solution.

Atari Games OpenAI Gym +1

GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning

no code implementations • 19 Apr 2018 • Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann

A key problem in deep multi-attribute learning is to effectively discover the inter-attribute correlation structures.

Attribute Neural Architecture Search

Multi-Channel Pyramid Person Matching Network for Person Re-Identification

no code implementations • 7 Mar 2018 • Chaojie Mao, Yingming Li, Yaqing Zhang, Zhongfei Zhang, Xi Li

In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations.

Person Re-Identification

Pyramid Person Matching Network for Person Re-identification

no code implementations • 7 Mar 2018 • Chaojie Mao, Yingming Li, Zhongfei Zhang, Yaqing Zhang, Xi Li

In this work, we present a deep convolutional pyramid person matching network (PPMN) with specially designed Pyramid Matching Module to address the problem of person re-identification.

Person Re-Identification

DR-Net: Transmission Steered Single Image Dehazing Network with Weakly Supervised Refinement

no code implementations • 2 Dec 2017 • Chongyi Li, Jichang Guo, Fatih Porikli, Chunle Guo, Huzhu Fu, Xi Li

Despite the recent progress in image dehazing, several problems remain largely unsolved such as robustness for varying scenes, the visual quality of reconstructed images, and effectiveness and flexibility for applications.

Image Dehazing Single Image Dehazing +1

Deep Air Learning: Interpolation, Prediction, and Feature Analysis of Fine-grained Air Quality

no code implementations • 2 Nov 2017 • Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li, Zhongfei Zhang

The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing.

feature selection

Boosted Zero-Shot Learning with Semantic Correlation Regularization

no code implementations • 25 Jul 2017 • Te Pi, Xi Li, Zhongfei Zhang

For adaptable knowledge transfer, we devise a Semantic Correlation Regularization (SCR) approach to regularize the boosted model to be consistent with the inter-class semantic correlations.

Model Selection Transfer Learning +1

Graph-Theoretic Spatiotemporal Context Modeling for Video Saliency Detection

no code implementations • 25 Jul 2017 • Lina Wei, Fangfang Wang, Xi Li, Fei Wu, Jun Xiao

As a result, a key issue in video saliency detection is how to effectively capture the intrinsic properties of atomic video structures as well as their associated contextual interactions along the spatial and temporal dimensions.

Video Saliency Detection

Group-wise Deep Co-saliency Detection

no code implementations • 24 Jul 2017 • Lina Wei, Shanshan Zhao, Omar El Farouk Bourahla, Xi Li, Fei Wu

In this paper, we propose an end-to-end group-wise deep co-saliency detection approach to address the co-salient object discovery problem based on the fully convolutional network (FCN) with group input and group output.

Co-Salient Object Detection Object Discovery +1

Deep Optical Flow Estimation Via Multi-Scale Correspondence Structure Learning

no code implementations • 23 Jul 2017 • Shanshan Zhao, Xi Li, Omar El Farouk Bourahla

Therefore, a key issue to solve in this area is how to effectively model the multi-scale correspondence structure properties in an adaptive end-to-end learning fashion.

Optical Flow Estimation

Deeply-Learned Part-Aligned Representations for Person Re-Identification

1 code implementation • ICCV 2017 • Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang

In this paper, we address the problem of person re-identification, which refers to associating the persons captured from different cameras.

Ranked #106 on Person Re-Identification on Market-1501

Person Re-Identification

Transductive Zero-Shot Learning with a Self-training dictionary approach

no code implementations • 27 Mar 2017 • Yunlong Yu, Zhong Ji, Xi Li, Jichang Guo, Zhongfei Zhang, Haibin Ling, Fei Wu

As an important and challenging problem in computer vision, zero-shot learning (ZSL) aims at automatically recognizing the instances from unseen object classes without training data.

Transductive Learning Transfer Learning +1

Deep Convolutional Neural Networks with Merge-and-Run Mappings

4 code implementations • 23 Nov 2016 • Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng

A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.

228

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

no code implementations • 23 May 2016 • Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, Xuehai Zhou

As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems.

Deep Learning Driven Visual Path Prediction from a Single Image

no code implementations • 27 Jan 2016 • Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang

The highly effective visual representation and deep context models ensure that our framework makes a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.

3D Hand Pose Estimation Using Randomized Decision Forest With Segmentation Index Points

no code implementations • ICCV 2015 • Peiyi Li, Haibin Ling, Xi Li, Chunyuan Liao

In this paper, we propose a real-time 3D hand pose estimation algorithm using the randomized decision forest framework.

3D Hand Pose Estimation

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

no code implementations • 19 Oct 2015 • Xi Li, Liming Zhao, Lina Wei, Ming-Hsuan Yang, Fei Wu, Yueting Zhuang, Haibin Ling, Jingdong Wang

A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner.

Image Segmentation Multi-Task Learning +6

Online Metric-Weighted Linear Representations for Robust Visual Tracking

no code implementations • 21 Jul 2015 • Xi Li, Chunhua Shen, Anthony Dick, Zhongfei Zhang, Yueting Zhuang

Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame.

Metric Learning Object +2

Metric Learning Driven Multi-Task Structured Output Optimization for Robust Keypoint Tracking

no code implementations • 4 Dec 2014 • Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang

As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.

Metric Learning Object Tracking