Search Results for author: Xin Li

Found 368 papers, 157 papers with code

Vector-quantized Image Modeling with Improved VQGAN

5 code implementations ICLR 2022 Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregressively.

Image Generation Representation Learning +1

Local Patch AutoAugment with Multi-Agent Collaboration

2 code implementations 20 Mar 2021 Shiqi Lin, Tao Yu, Ruoyu Feng, Xin Li, Xin Jin, Zhibo Chen

We formulate it as a multi-agent reinforcement learning (MARL) problem, where each agent learns an augmentation policy for each patch based on its content together with the semantics of the whole image.

Data Augmentation Fine-Grained Image Recognition +2

Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

2 code implementations 28 Nov 2023 Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing

Large Vision-Language Models (LVLMs) have advanced considerably, intertwining visual recognition and language understanding to generate content that is not only coherent but also contextually attuned.

Hallucination Object

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

2 code implementations CVPR 2021 Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Inspired by the common painting process of drawing a draft and revising the details, we introduce a novel feed-forward method named Laplacian Pyramid Network (LapStyle).

Style Transfer

Deep Concept-wise Temporal Convolutional Networks for Action Localization

2 code implementations 26 Aug 2019 Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, WangMeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen

In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.

Action Classification Action Localization

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

15 code implementations ICCV 2019 Tianwei Lin, Xiao Liu, Xin Li, Errui Ding, Shilei Wen

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denote a proposal as a matching pair of starting and ending boundaries and combine all densely distributed BM pairs into the BM confidence map.

Action Detection Action Recognition +1

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

1 code implementation 5 Jun 2023 Hang Zhang, Xin Li, Lidong Bing

We present Video-LLaMA, a multi-modal framework that empowers Large Language Models (LLMs) with the capability of understanding both visual and auditory content in the video.

Language Modelling Text Generation +7

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations 22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation Text-to-Image Generation +1

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

2 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks.

Object Detection Reinforcement Learning (RL) +1

Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey

1 code implementation 18 Aug 2023 Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen

Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.

Deblurring Image Restoration +2

Drive Like a Human: Rethinking Autonomous Driving with Large Language Models

1 code implementation 14 Jul 2023 Daocheng Fu, Xin Li, Licheng Wen, Min Dou, Pinlong Cai, Botian Shi, Yu Qiao

In this paper, we explore the potential of using a large language model (LLM) to understand the driving environment in a human-like manner and analyze its ability to reason, interpret, and memorize when facing complex scenarios.

Autonomous Driving Common Sense Reasoning +3

DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models

2 code implementations 28 Sep 2023 Licheng Wen, Daocheng Fu, Xin Li, Xinyu Cai, Tao Ma, Pinlong Cai, Min Dou, Botian Shi, Liang He, Yu Qiao

Recent advancements in autonomous driving have relied on data-driven approaches, which are widely adopted but face challenges including dataset bias, overfitting, and uninterpretability.

Autonomous Driving Common Sense Reasoning +1

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

1 code implementation 9 Nov 2023 Licheng Wen, Xuemeng Yang, Daocheng Fu, XiaoFeng Wang, Pinlong Cai, Xin Li, Tao Ma, Yingxuan Li, Linran Xu, Dengke Shang, Zheng Zhu, Shaoyan Sun, Yeqi Bai, Xinyu Cai, Min Dou, Shuanglu Hu, Botian Shi, Yu Qiao

This has been a significant bottleneck, particularly in the development of common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving.

Autonomous Driving Common Sense Reasoning +4

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer

3 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Meiling Wang, Xin Li, Zhengxing Sun, Qian Li, Errui Ding

Finally, the content feature is normalized so that it demonstrates the same local feature statistics as the calculated per-point weighted style feature statistics.

Style Transfer Video Style Transfer

GRIP++: Enhanced Graph-based Interaction-aware Trajectory Prediction for Autonomous Driving

5 code implementations arXiv preprint 2020 Xin Li, Xiaowen Ying, Mooi Choo Chuah

Despite the advancement in the technology of autonomous driving cars, the safety of a self-driving car is still a challenging problem that has not been well studied.

Autonomous Driving motion prediction +1

SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking

2 code implementations 16 Nov 2022 Yu-Hsiang Wang, Jun-Wei Hsieh, Ping-Yang Chen, Ming-Ching Chang, Hung Hin So, Xin Li

Second, we develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames, further enhancing MOT performance.

 Ranked #1 on Multi-Object Tracking on MOT20 (using extra training data)

Multi-Object Tracking Multiple Object Tracking +3

Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search

2 code implementations CVPR 2019 Xin Li, Yiming Zhou, Zheng Pan, Jiashi Feng

It prunes the architecture search space with a partial order assumption to automatically search for the architectures with the best speed and accuracy trade-off.

Neural Architecture Search

Transformation Networks for Target-Oriented Sentiment Classification

2 code implementations ACL 2018 Xin Li, Lidong Bing, Wai Lam, Bei Shi

Between the two layers, we propose a component to generate target-specific representations of words in the sentence, while incorporating a mechanism for preserving the original contextual information from the RNN layer.

Aspect-Based Sentiment Analysis (ABSA) Classification +3

scCDCG: Efficient Deep Structural Clustering for single-cell RNA-seq via Deep Cut-informed Graph Embedding

2 code implementations 9 Apr 2024 Ping Xu, Zhiyuan Ning, Meng Xiao, Guihai Feng, Xin Li, Yuanchun Zhou, Pengfei Wang

Addressing these limitations, we introduce scCDCG (single-cell RNA-seq Clustering via Deep Cut-informed Graph), a novel framework designed for efficient and accurate clustering of scRNA-seq data that simultaneously utilizes intercellular high-order structural information.

Clustering Dimensionality Reduction +2

RF-Net: An End-to-End Image Matching Network based on Receptive Field

1 code implementation CVPR 2019 Xuelun Shen, Cheng Wang, Xin Li, Zenglei Yu, Jonathan Li, Chenglu Wen, Ming Cheng, Zijian He

This paper proposes a new end-to-end trainable matching network based on receptive field, RF-Net, to compute sparse correspondence between images.

Keypoint Detection

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark

1 code implementation 3 Aug 2020 Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng

We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.

Thermal Infrared Object Tracking Vocal Bursts Intensity Prediction

nnMobileNet: Rethinking CNN for Retinopathy Research

2 code implementations 2 Jun 2023 Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang

Over the past few decades, convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD).

Diabetic Retinopathy Grading

SeaLLMs -- Large Language Models for Southeast Asia

1 code implementation 1 Dec 2023 Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages.

Instruction Following

VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows

2 code implementations 11 Aug 2021 Xiao Wang, Jianing Li, Lin Zhu, Zhipeng Zhang, Zhe Chen, Xin Li, YaoWei Wang, Yonghong Tian, Feng Wu

Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency.

Object Tracking

Micron-BERT: BERT-based Facial Micro-Expression Recognition

1 code implementation CVPR 2023 Xuan-Bac Nguyen, Chi Nhan Duong, Xin Li, Susan Gauch, Han-Seok Seo, Khoa Luu

By incorporating these components into an end-to-end deep network, the proposed $\mu$-BERT significantly outperforms all previous work in various micro-expression tasks.

Micro Expression Recognition Micro-Expression Recognition +1

A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges

1 code implementation 2 Mar 2022 Wenxuan Zhang, Xin Li, Yang Deng, Lidong Bing, Wai Lam

More specifically, we provide a new taxonomy for ABSA which organizes existing studies from the axes of concerned sentiment elements, with an emphasis on recent advances of compound ABSA tasks.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

SCPNet: Semantic Scene Completion on Point Cloud

1 code implementation CVPR 2023 Zhaoyang Xia, Youquan Liu, Xin Li, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao

We propose a simple yet effective label rectification strategy, which uses off-the-shelf panoptic segmentation labels to remove the traces of dynamic objects in completion labels, greatly improving the performance of deep models especially for those moving objects.

3D Semantic Scene Completion Knowledge Distillation +3

Aspect Sentiment Quad Prediction as Paraphrase Generation

1 code implementation EMNLP 2021 Wenxuan Zhang, Yang Deng, Xin Li, Yifei Yuan, Lidong Bing, Wai Lam

Aspect-based sentiment analysis (ABSA) has been extensively studied in recent years, which typically involves four fundamental sentiment elements, including the aspect category, aspect term, opinion term, and sentiment polarity.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

PTB-TIR: A Thermal Infrared Pedestrian Tracking Benchmark

1 code implementation 18 Jan 2018 Qiao Liu, Zhenyu He, Xin Li, Yuan Zheng

The ability to evaluate the TIR pedestrian tracker fairly, on a benchmark dataset, is significant for the development of this field.

Attribute Thermal Infrared Object Tracking

CLEX: Continuous Length Extrapolation for Large Language Models

1 code implementation 25 Oct 2023 Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing

We generalise the PE scaling approaches to model the continuous dynamics by ordinary differential equations over the length scaling factor, thereby overcoming the constraints of current PE scaling methods designed for specific lengths.

4k Position

Deep Models with Fusion Strategies for MVP Point Cloud Registration

1 code implementation 18 Oct 2021 Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández

The main goal of point cloud registration in Multi-View Partial (MVP) Challenge 2021 is to estimate a rigid transformation to align a point cloud pair.

Point Cloud Registration

GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph

1 code implementation NeurIPS 2023 Xin Li, Dongze Lian, Zhihe Lu, Jiawang Bai, Zhibo Chen, Xinchao Wang

To mitigate that, we propose an effective adapter-style tuning strategy, dubbed GraphAdapter, which performs the textual adapter by explicitly modeling the dual-modality structure knowledge (i.e., the correlation of different semantics/classes in textual and visual modalities) with a dual knowledge graph.

Transfer Learning

Aspect Term Extraction with History Attention and Selective Transformation

1 code implementation 2 May 2018 Xin Li, Lidong Bing, Piji Li, Wai Lam, Zhimou Yang

Aspect Term Extraction (ATE), a key sub-task in Aspect-Based Sentiment Analysis, aims to extract explicit aspect expressions from online user reviews.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

Neural Color Operators for Sequential Image Retouching

2 code implementations 17 Jul 2022 Yili Wang, Xin Li, Kun Xu, Dongliang He, Qi Zhang, Fu Li, Errui Ding

The neural color operator mimics the behavior of traditional color operators and learns pixelwise color transformation while its strength is controlled by a scalar.

Image Enhancement Image Retouching

Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness

1 code implementation 28 Apr 2021 Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, Zhaoxiang Zhang

Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.

Image Inpainting valid

Learning Semantic Person Image Generation by Region-Adaptive Normalization

1 code implementation CVPR 2021 Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, WangMeng Zuo

In the first stage, we predict the target semantic parsing maps to eliminate the difficulties of pose transfer and further benefit the latter translation of per-region appearance style.

Pose Transfer Semantic Parsing +1

JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition

3 code implementations 11 Sep 2018 Canyu Le, Xin Li

Existing reassembly pipelines commonly consist of a local matching stage and a global compositions stage.

Pyramid Mask Text Detector

1 code implementation 28 Mar 2019 Jingchao Liu, Xuebo Liu, Jie Sheng, Ding Liang, Xin Li, Qingjie Liu

Scene text detection, an essential step of scene text recognition system, is to locate text instances in natural scene images automatically.

Clustering Instance Segmentation +4

NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences

1 code implementation CVPR 2019 Chen Zhao, Zhiguo Cao, Chi Li, Xin Li, Jiaqi Yang

Feature correspondence selection is pivotal to many feature-matching based tasks in computer vision.

Image-to-Image Translation with Deep Reinforcement Learning

1 code implementation 24 Sep 2023 Xin Wang, Ziwei Luo, Jing Hu, Chengming Feng, Shu Hu, Bin Zhu, Xi Wu, Xin Li, Siwei Lyu

The key feature in the RL-I2IT framework is to decompose a monolithic learning process into small steps with a lightweight model to progressively transform a source image successively to a target image.

Auxiliary Learning Decision Making +3

Exploiting Coarse-to-Fine Task Transfer for Aspect-level Sentiment Classification

1 code implementation AAAI 2019 2018 Zheng Li, Ying WEI, Yu Zhang, Xiang Zhang, Xin Li, Qiang Yang

Aspect-level sentiment classification (ASC) aims at identifying sentiment polarities towards aspects in a sentence, where the aspect can behave as a general Aspect Category (AC) or a specific Aspect Term (AT).

General Classification Sentence +2

MGeo: Multi-Modal Geographic Pre-Training Method

1 code implementation 11 Jan 2023 Ruixue Ding, Boli Chen, Pengjun Xie, Fei Huang, Xin Li, Qiang Zhang, Yao Xu

Single-modal PTMs can barely make use of the important GC and therefore have limited performance.

Language Modelling

Cascade Graph Neural Networks for RGB-D Salient Object Detection

1 code implementation ECCV 2020 Ao Luo, Xin Li, Fan Yang, Zhicheng Jiao, Hong Cheng, Siwei Lyu

Current works either simply distill prior knowledge from the corresponding depth map for handling the RGB-image or blindly fuse color and geometric information to generate the coarse depth-aware representations, hindering the performance of RGB-D saliency detectors. In this work, we introduce Cascade Graph Neural Networks (Cas-Gnn), a unified framework which is capable of comprehensively distilling and reasoning the mutual benefits between these two data sources through a set of cascade graphs, to learn powerful representations for RGB-D salient object detection.

Object object-detection +3

Mutual Graph Learning for Camouflaged Object Detection

1 code implementation CVPR 2021 Qiang Zhai, Xin Li, Fan Yang, Chenglizhao Chen, Hong Cheng, Deng-Ping Fan

Automatically detecting/segmenting object(s) that blend in with their surroundings is difficult for current models.

Graph Learning Object +2

Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective

2 code implementations CVPR 2023 Xin Li, Bingchen Li, Xin Jin, Cuiling Lan, Zhibo Chen

In this paper, we are the first to propose a novel training strategy for image restoration from the causality perspective, to improve the generalization ability of DNNs for unknown degradations.

counterfactual Image Restoration +2

MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild

1 code implementation 23 Aug 2023 Yu-Xiang Zeng, Jun-Wei Hsieh, Xin Li, Ming-Ching Chang

Detecting small scene text instances in the wild is particularly challenging, where the influence of irregular positions and nonideal lighting often leads to detection errors.

Scene Text Detection Text Detection

SeD: Semantic-Aware Discriminator for Image Super-Resolution

1 code implementation 29 Feb 2024 Bingchen Li, Xin Li, Hanxin Zhu, Yeying Jin, Ruoyu Feng, Zhizheng Zhang, Zhibo Chen

In particular, one discriminator is utilized to enable the SR network to learn the distribution of real-world high-quality images in an adversarial training manner.

Image Super-Resolution

Saliency-Associated Object Tracking

1 code implementation ICCV 2021 Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He

A potential limitation of such trackers is that not all patches are equally informative for tracking.

Object Object Tracking

Learning Optical Flow with Adaptive Graph Reasoning

1 code implementation 8 Feb 2022 Ao Luo, Fan Yang, Kunming Luo, Xin Li, Haoqiang Fan, Shuaicheng Liu

Our key idea is to decouple the context reasoning from the matching procedure, and exploit scene information to effectively assist motion estimation by learning to reason over the adaptive graph.

Motion Estimation Optical Flow Estimation +1

Multi-Task Driven Feature Models for Thermal Infrared Tracking

1 code implementation 26 Nov 2019 Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Wei Liu, Yonsheng Liang

These two feature models are learned using a multi-task matching framework and are jointly optimized on the TIR tracking task.

Thermal Infrared Object Tracking

GAFlow: Incorporating Gaussian Attention into Optical Flow

1 code implementation ICCV 2023 Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu

Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM) which not only inherits properties from Gaussian distribution to instinctively revolve around the neighbor fields of each point but also is empowered to put the emphasis on contextually related regions during matching.

Optical Flow Estimation Representation Learning

Detecting Multimedia Generated by Large AI Models: A Survey

1 code implementation 22 Jan 2024 Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu

The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life.

Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models

2 code implementations 3 Feb 2021 Shang Wang, Peiming Yang, Yuxuan Zheng, Xin Li, Gennady Pekhimenko

Driven by the tremendous effort in researching novel deep learning (DL) algorithms, the training cost of developing new models has increased staggeringly in recent years.

DDGCN: A Dynamic Directed Graph Convolutional Network for Action Recognition

1 code implementation ECCV 2020 Matthew Korban, Xin Li

We propose a Dynamic Directed Graph Convolutional Network (DDGCN) to model spatial and temporal features of human actions from their skeletal representations.

Action Recognition

CiteTracker: Correlating Image and Text for Visual Tracking

1 code implementation ICCV 2023 Xin Li, Yuqing Huang, Zhenyu He, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking.

Attribute Descriptive +2

KVQ: Kwai Video Quality Assessment for Short-form Videos

1 code implementation 11 Feb 2024 Yiting Lu, Xin Li, Yajing Pei, Kun Yuan, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen

Short-form UGC video platforms, like Kwai and TikTok, have become an emerging and irreplaceable mainstream media form, thriving on user-friendly engagement, kaleidoscopic creation, etc.

Video Quality Assessment Visual Question Answering (VQA)

NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

1 code implementation CVPR 2022 Hao Liu, Xinghua Jiang, Xin Li, Zhimin Bao, Deqiang Jiang, Bo Ren

For the sake of trade-off between efficiency and performance, a group of works merely perform SA operation within local patches, whereas the global contextual information is abandoned, which would be indispensable for visual recognition tasks.

object-detection Object Detection +1

CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations

1 code implementation ICCV 2023 Qiming Xia, Jinhao Deng, Chenglu Wen, Hai Wu, Shaoshuai Shi, Xin Li, Cheng Wang

Combining CoIn with an iterative training strategy, we propose a CoIn++ pipeline, which requires only 2% annotations in the KITTI dataset to achieve performance comparable to the fully supervised methods.

3D Object Detection Contrastive Learning +2

Low-Light Image Enhancement with Multi-Stage Residue Quantization and Brightness-Aware Attention

1 code implementation ICCV 2023 Yunlong Liu, Tao Huang, Weisheng Dong, Fangfang Wu, Xin Li, Guangming Shi

Deep learning-based LLIE methods focus on learning a mapping function between low-light images and normal-light images that outperforms conventional LLIE methods.

Low-Light Image Enhancement Quantization

Contour Knowledge Transfer for Salient Object Detection

1 code implementation ECCV 2018 Xin Li, Fan Yang, Hong Cheng, Wei Liu, Dinggang Shen

Our goal is to overcome this limitation by automatically converting an existing deep contour detection model into a salient object detection model without using any manual salient object masks.

Contour Detection Object +4

COVID-MobileXpert: On-Device COVID-19 Patient Triage and Follow-up using Chest X-rays

1 code implementation 6 Apr 2020 Xin Li, Chengyin Li, Dongxiao Zhu

We design and implement a novel three-player knowledge transfer and distillation (KTD) framework including a pre-trained attending physician (AP) network that extracts CXR imaging features from a large scale of lung disease CXR images, a fine-tuned resident fellow (RF) network that learns the essential CXR imaging features to discriminate COVID-19 from pneumonia and/or normal cases with a small amount of COVID-19 cases, and a trained lightweight medical student (MS) network to perform on-device COVID-19 patient triage and follow-up.

Computed Tomography (CT) Trajectory Prediction +1

Unsupervised Learning of Accurate Siamese Tracking

1 code implementation CVPR 2022 Qiuhong Shen, Lei Qiao, Jinyang Guo, Peixia Li, Xin Li, Bo Li, Weitao Feng, Weihao Gan, Wei Wu, Wanli Ouyang

As unlimited self-supervision signals can be obtained by tracking a video along a cycle in time, we investigate evolving a Siamese tracker by tracking videos forward-backward.

Visual Object Tracking

Improving Fine-grained Entity Typing with Entity Linking

1 code implementation IJCNLP 2019 Hongliang Dai, Donghong Du, Xin Li, Yangqiu Song

Fine-grained entity typing is a challenging problem since it usually involves a relatively large tag set and may require understanding the context of the entity mention.

Entity Linking Entity Typing +1

Face Beautification: Beyond Makeup Transfer

1 code implementation 8 Dec 2019 Xudong Liu, Ruizhe Wang, Chih-Fan Chen, Minglei Yin, Hao Peng, Shukhan Ng, Xin Li

Inspired by the latest advances in style-based synthesis and face beauty prediction, we propose a novel framework of face beautification.

Translation

A Chinese Corpus for Fine-grained Entity Typing

1 code implementation LREC 2020 Chin Lee, Hongliang Dai, Yangqiu Song, Xin Li

In this paper, we introduce a corpus for Chinese fine-grained entity typing that contains 4,800 mentions manually labeled through crowdsourcing.

Cross-Lingual Transfer Entity Typing +1

Model Attribution of Face-swap Deepfake Videos

1 code implementation 25 Feb 2022 Shan Jia, Xin Li, Siwei Lyu

Then we take Deepfakes model attribution as a multiclass classification task and propose a spatial and temporal attention based method to explore the differences among Deepfakes in the new dataset.

Attribute Face Swapping

Hierarchical Spatial-aware Siamese Network for Thermal Infrared Object Tracking

1 code implementation 27 Nov 2017 Xin Li, Qiao Liu, Nana Fan, Zhenyu He, Hongzhi Wang

In this paper, we cast the TIR tracking problem as a similarity verification task, which is coupled well to the objective of the tracking task.

General Classification Thermal Infrared Object Tracking

An Informative Tracking Benchmark

1 code implementation 13 Dec 2021 Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.

Visual Tracking

Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs

1 code implementation 16 Nov 2023 Sen Yang, Xin Li, Leyang Cui, Lidong Bing, Wai Lam

Though prompting LLMs with various reasoning structures produces reasoning proofs along with answers, these proofs are not ensured to be causal and reliable due to the inherent defects of LLMs.

GSM8K

MHSA-Net: Multi-Head Self-Attention Network for Occluded Person Re-Identification

1 code implementation 10 Aug 2020 Hongchen Tan, Xiuping Liu, BaoCai Yin, Xin Li

This paper presents a novel person re-identification model, named Multi-Head Self-Attention Network (MHSA-Net), to prune unimportant information and capture key local information from person images.

Person Re-Identification

Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models

1 code implementation 28 Nov 2023 Zhihe Lu, Jiawang Bai, Xin Li, Zeyu Xiao, Xinchao Wang

However, performance advancements are limited when relying solely on intricate algorithmic designs for a single model, even one exhibiting strong performance, e.g., CLIP-ViT-B/16.

Prompt Engineering

Once Upon a $\textit{Time}$ in $\textit{Graph}$: Relative-Time Pretraining for Complex Temporal Reasoning

1 code implementation 23 Oct 2023 Sen Yang, Xin Li, Lidong Bing, Wai Lam

However, the knowledge-time association is usually insufficient for the downstream tasks that require reasoning over temporal dependencies between knowledge.

Question Answering

Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking

1 code implementation 9 Jun 2019 Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Hongpeng Wang

These two similarities complement each other and hence enhance the discriminative capacity of the network for handling distractors.

Semantic Similarity Thermal Infrared Object Tracking

DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning

1 code implementation NeurIPS 2021 Kelly Kostopoulou, Hang Xu, Aritra Dutta, Xin Li, Alexandros Ntoulas, Panos Kalnis

This paper introduces DeepReduce, a versatile framework for the compressed communication of sparse tensors, tailored for distributed deep learning.

DeepReduce: A Sparse-tensor Communication Framework for Federated Deep Learning

1 code implementation NeurIPS 2021 Hang Xu, Kelly Kostopoulou, Aritra Dutta, Xin Li, Alexandros Ntoulas, Panos Kalnis

DeepReduce is orthogonal to existing gradient sparsifiers and can be applied in conjunction with them, transparently to the end-user, to significantly lower the communication overhead.

ConNER: Consistency Training for Cross-lingual Named Entity Recognition

1 code implementation17 Nov 2022 Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Luo Si, Chunyan Miao

We propose ConNER as a novel consistency training framework for cross-lingual NER, which comprises: (1) translation-based consistency training on unlabeled target-language data, and (2) dropout-based consistency training on labeled source-language data.

Cross-Lingual NER Knowledge Distillation +3

No trends in spring and autumn phenology during the global warming hiatus

1 code implementation Nature Communications 2019 Xufeng Wang, Jingfeng Xiao, Xin Li, Guodong Cheng, Mingguo Ma, Gaofeng Zhu, M. Altaf Arain, T. Andrew Black & Rachhpal S. Jassal

Phenology plays a fundamental role in regulating photosynthesis, evapotranspiration, and surface energy fluxes and is sensitive to climate change.

Rotation Invariant Point Cloud Classification: Where Local Geometry Meets Global Topology

1 code implementation1 Nov 2019 Chen Zhao, Jiaqi Yang, Xin Xiong, Angfan Zhu, Zhiguo Cao, Xin Li

To the best of our knowledge, this work is the first principled approach toward adaptively combining global and local information under the context of RI point cloud analysis.

General Classification Point Cloud Classification

A Detector-oblivious Multi-arm Network for Keypoint Matching

1 code implementation2 Apr 2021 Xuelun Shen, Cheng Wang, Xin Li, Qian Hu, Jingyi Zhang

This paper presents a matching network to establish point correspondence between images.

Multilingual AMR Parsing with Noisy Knowledge Distillation

1 code implementation Findings (EMNLP) 2021 Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam

We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher.

AMR Parsing Knowledge Distillation

SwinIQA: Learned Swin Distance for Compressed Image Quality Assessment

1 code implementation9 May 2022 Jianzhao Liu, Xin Li, Yanding Peng, Tao Yu, Zhibo Chen

In this paper, we design a full-reference image quality assessment metric SwinIQA to measure the perceptual quality of compressed images in a learned Swin distance space.

Compressed Image Quality Assessment Image Compression +1

Manifold Learning of Four-dimensional Scanning Transmission Electron Microscopy

1 code implementation18 Oct 2018 Xin Li, Ondrej E. Dyck, Mark P. Oxley, Andrew R. Lupini, Leland McInnes, John Healy, Stephen Jesse, Sergei V. Kalinin

Four-dimensional scanning transmission electron microscopy (4D-STEM) of local atomic diffraction patterns is emerging as a powerful technique for probing intricate details of atomic structure and atomic electric fields.

Probabilistic Model Distillation for Semantic Correspondence

1 code implementation CVPR 2021 Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu

We address this problem with the use of a novel Probabilistic Model Distillation (PMD) approach which transfers knowledge learned by a probabilistic teacher model on synthetic data to a static student model with the use of unlabeled real image pairs.

Representation Learning Semantic correspondence

Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples

1 code implementation22 Nov 2021 Linlin Liu, Xin Li, Ruidan He, Lidong Bing, Shafiq Joty, Luo Si

In this work, we explore methods to make better use of the multilingual annotation and language agnostic property of KG triples, and present novel knowledge based multilingual language models (KMLMs) trained directly on the knowledge triples.

Knowledge Graphs Language Modelling +9

HST: Hierarchical Swin Transformer for Compressed Image Super-resolution

3 code implementations21 Aug 2022 Bingchen Li, Xin Li, Yiting Lu, Sen Liu, Ruoyu Feng, Zhibo Chen

Compressed Image Super-resolution has achieved great attention in recent years, where images are degraded with compression artifacts and low-resolution artifacts.

Compressed Image Super-resolution Image Super-Resolution

Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations

1 code implementation16 Nov 2022 Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing

Due to the huge number of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in low-resource scenarios.

WL-Align: Weisfeiler-Lehman Relabeling for Aligning Users across Networks via Regularized Representation Learning

1 code implementation29 Dec 2022 Li Liu, Penggang Chen, Xin Li, William K. Cheung, Youmin Zhang, Qun Liu, Guoyin Wang

Aligning users across networks using graph representation learning has been found effective where the alignment is accomplished in a low-dimensional embedding space.

Graph Representation Learning

Domain-adversarial Network Alignment

1 code implementation15 Aug 2019 Huiting Hong, Xin Li, Yuangang Pan, Ivor Tsang

Network alignment is a critical task to a wide variety of fields.

Network Embedding

Toward Tag-free Aspect Based Sentiment Analysis: A Multiple Attention Network Approach

3 code implementations22 Mar 2020 Yao Qiang, Xin Li, Dongxiao Zhu

Existing aspect based sentiment analysis (ABSA) approaches leverage various neural network models to extract the aspect sentiments via learning aspect-specific feature representations.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Improving Adversarial Robustness via Probabilistically Compact Loss with Logit Constraints

1 code implementation14 Dec 2020 Xin Li, Xiangrui Li, Deng Pan, Dongxiao Zhu

This inspires us to propose a new Probabilistically Compact (PC) loss with logit constraints which can be used as a drop-in replacement for cross-entropy (CE) loss to improve CNN's adversarial robustness.

Adversarial Robustness

SimSR: Simple Distance-based State Representation for Deep Reinforcement Learning

2 code implementations31 Dec 2021 Hongyu Zang, Xin Li, Mingzhong Wang

This work explores how to learn robust and generalizable state representation from image-based observations with deep reinforcement learning methods.

reinforcement-learning Reinforcement Learning (RL)

Semantic-aware Message Broadcasting for Efficient Unsupervised Domain Adaptation

1 code implementation6 Dec 2022 Xin Li, Cuiling Lan, Guoqiang Wei, Zhibo Chen

In this way, our message broadcasting encourages the group tokens to learn more informative and diverse information for effective domain alignment.

Pseudo Label Unsupervised Domain Adaptation

SPARTAN: Self-supervised Spatiotemporal Transformers Approach to Group Activity Recognition

1 code implementation6 Mar 2023 Naga VS Raviteja Chappa, Pha Nguyen, Alexander H Nelson, Han-Seok Seo, Xin Li, Page Daniel Dobbs, Khoa Luu

In this paper, we propose a new, simple, and effective Self-supervised Spatio-temporal Transformers (SPARTAN) approach to Group Activity Recognition (GAR) using unlabeled video data.

Group Activity Recognition

Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning

1 code implementation23 May 2023 Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Chunyan Miao

In cross-lingual named entity recognition (NER), self-training is commonly used to bridge the linguistic gap by training on pseudo-labeled target-language data.

Cross-Lingual NER named-entity-recognition +4

Task-driven Semantic Coding via Reinforcement Learning

1 code implementation7 Jun 2021 Xin Li, Jun Shi, Zhibo Chen

However, the traditional hybrid coding framework cannot be optimized in an end-to-end manner, which makes task-driven semantic fidelity metric unable to be automatically integrated into the rate-distortion optimization process.

Face Detection License Plate Detection +4

Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation

1 code implementation18 Oct 2022 Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam

Unlike most prior work that only evaluates the ability to measure semantic similarity, we present a thorough evaluation of existing multilingual sentence embeddings and our improved versions, which include a collection of five transfer tasks in different downstream applications.

Semantic Similarity Semantic Textual Similarity +2

RTracker: Recoverable Tracking via PN Tree Structured Memory

1 code implementation28 Mar 2024 Yuqing Huang, Xin Li, Zikun Zhou, YaoWei Wang, Zhenyu He, Ming-Hsuan Yang

Upon the PN tree memory, we develop corresponding walking rules for determining the state of the target and define a set of control flows to unite the tracker and the detector in different tracking scenarios.

Compressed Sensing of Scanning Transmission Electron Microscopy (STEM) on Non-Rectangular Scans

1 code implementation13 May 2018 Xin Li, Ondrej Dyck, Sergei V. Kalinin, Stephen Jesse

Scanning Transmission Electron Microscopy (STEM) has become the mainstay for materials characterization at the atomic level, with applications ranging from visualization of localized and extended defects to mapping order parameter fields.

DAC: Data-free Automatic Acceleration of Convolutional Networks

1 code implementation20 Dec 2018 Xin Li, Shuai Zhang, Bolan Jiang, Yingyong Qi, Mooi Choo Chuah, Ning Bi

A complex deep learning model with high accuracy runs slowly on resource-limited devices, while a light-weight model that runs much faster loses accuracy.

Image Classification Multi-Person Pose Estimation +2

Probabilistic prediction of the heave motions of a semi-submersible by a deep learning problem model

1 code implementation9 Oct 2021 Xiaoxian Guo, Xiantao Zhang, Xinliang Tian, Wenyue Lu, Xin Li

In this study, we extend a deep learning (DL) model, which could predict the heave and surge motions of a floating semi-submersible 20 to 50 seconds ahead with good accuracy, to quantify its uncertainty of the predictive time series with the help of the dropout technique.

Motion Compensation motion prediction +2

Behavior Prior Representation learning for Offline Reinforcement Learning

1 code implementation2 Nov 2022 Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet des Combes, Romain Laroche

Our method, Behavior Prior Representation (BPR), learns state representations with an easy-to-integrate objective based on behavior cloning of the dataset: we first learn a state representation by mimicking actions from the dataset, and then train a policy on top of the fixed representation, using any off-the-shelf Offline RL algorithm.

Offline RL reinforcement-learning +2

From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader

1 code implementation9 Dec 2022 Weiwen Xu, Xin Li, Wenxuan Zhang, Meng Zhou, Wai Lam, Luo Si, Lidong Bing

We present Pre-trained Machine Reader (PMR), a novel method for retrofitting pre-trained masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.

Classification Extractive Question-Answering +6

mPMR: A Multilingual Pre-trained Machine Reader at Scale

1 code implementation23 May 2023 Weiwen Xu, Xin Li, Wai Lam, Lidong Bing

mPMR aims to guide multilingual pre-trained language models (mPLMs) to perform natural language understanding (NLU) including both sequence classification and span extraction in multiple languages.

Classification Machine Reading Comprehension +3

Dual-view Correlation Hybrid Attention Network for Robust Holistic Mammogram Classification

1 code implementation19 Jun 2023 Zhiwei Wang, Junlin Xian, Kangyi Liu, Xin Li, Qiang Li, Xin Yang

Mammogram images are important for breast cancer screening and are typically obtained in a dual-view form, i.e., cranio-caudal (CC) and mediolateral oblique (MLO), to provide complementary information.

Clinical Knowledge

The Algonauts Project 2023 Challenge: UARK-UAlbany Team Solution

1 code implementation1 Aug 2023 Xuan-Bac Nguyen, Xudong Liu, Xin Li, Khoa Luu

The goal is to predict brain responses across the entire visual brain, as it is the region where the most reliable responses to images have been observed.

SBGAR: Semantics Based Group Activity Recognition

1 code implementation ICCV 2017 Xin Li, Mooi Choo Chuah

Activity recognition has become an important function in many emerging computer vision applications, e.g., automatic video surveillance systems, human-computer interaction applications, and video recommendation systems.

Group Activity Recognition

On the Learning Property of Logistic and Softmax Losses for Deep Neural Networks

1 code implementation4 Mar 2020 Xiangrui Li, Xin Li, Deng Pan, Dongxiao Zhu

Deep convolutional neural networks (CNNs) trained with logistic and softmax losses have made significant advances in visual recognition tasks in computer vision.

Binary Classification Classification +2

Aspect-based Sentiment Analysis in Question Answering Forums

1 code implementation Findings (EMNLP) 2021 Wenxuan Zhang, Yang Deng, Xin Li, Lidong Bing, Wai Lam

This motivates us to investigate the task of ABSA on QA forums (ABSA-QA), aiming to jointly detect the discussed aspects and their sentiment polarities for a given QA pair.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

PeerDA: Data Augmentation via Modeling Peer Relation for Span Identification Tasks

1 code implementation17 Oct 2022 Weiwen Xu, Xin Li, Yang Deng, Wai Lam, Lidong Bing

Specifically, a novel Peer Data Augmentation (PeerDA) approach is proposed which employs span pairs with the PR relation as the augmentation data for training.

Data Augmentation Relation

AQE: Argument Quadruplet Extraction via a Quad-Tagging Augmented Generative Approach

1 code implementation31 May 2023 Jia Guo, Liying Cheng, Wenxuan Zhang, Stanley Kok, Xin Li, Lidong Bing

In this work, we propose, for the first time, a challenging argument quadruplet extraction (AQE) task, which provides an all-in-one extraction of four argumentative components, i.e., claims, evidence, evidence types, and stances.

Argument Mining Stance Classification +1

A Real-Time Deep Network for Crowd Counting

1 code implementation16 Feb 2020 Xiaowen Shi, Xin Li, Caili Wu, Shuchen Kong, Jing Yang, Liang He

Automatic analysis of highly crowded people has attracted extensive attention from computer vision research.

Crowd Counting

A streamlined Approach to Multimodal Few-Shot Class Incremental Learning for Fine-Grained Datasets

2 code implementations10 Mar 2024 Thang Doan, Sima Behpour, Xin Li, Wenbin He, Liang Gou, Liu Ren

Few-shot Class-Incremental Learning (FSCIL) poses the challenge of retaining prior knowledge while learning from limited new data streams, all without overfitting.

Few-Shot Class-Incremental Learning Incremental Learning

On Improving Deep Reinforcement Learning for POMDPs

1 code implementation26 Apr 2017 Pengfei Zhu, Xin Li, Pascal Poupart, Guanghui Miao

Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go.

Atari Games Decision Making +4

A Saliency-Guided Street View Image Inpainting Framework for Efficient Last-Meters Wayfinding

1 code implementation14 May 2022 Chuanbo Hu, Shan Jia, Fan Zhang, Xin Li

However, due to the large diversity of geographic context and acquisition conditions, the captured SVI always contains various distracting objects (e.g., pedestrians and vehicles), which will distract human visual attention from efficiently finding the destination in the last few meters.

Image Inpainting object-detection +2

Fusion-based Few-Shot Morphing Attack Detection and Fingerprinting

1 code implementation27 Oct 2022 Na Zhang, Shan Jia, Siwei Lyu, Xin Li

Our technical contributions include: 1) We propose a fusion-based few-shot learning (FSL) method to learn discriminative features that can generalize to unseen morphing attack types from predefined presentation attacks; 2) The proposed FSL based on the fusion of the PRNU model and Noiseprint network is extended from binary MAD to multiclass morphing attack fingerprinting (MAF).

Face Recognition Few-Shot Learning

Negative Flux Aggregation to Estimate Feature Attributions

1 code implementation17 Jan 2023 Xin Li, Deng Pan, Chengyin Li, Yao Qiang, Dongxiao Zhu

There are increasing demands for understanding deep neural networks' (DNNs) behavior spurred by growing security and/or transparency concerns.

On the K-theory of crossed products by automorphic semigroup actions

1 code implementation24 May 2012 Joachim Cuntz, Siegfried Echterhoff, Xin Li

Let P be a semigroup that admits an embedding into a group G. Assume that the embedding satisfies a certain Toeplitz condition and that the Baum-Connes conjecture holds for G. We prove a formula describing the K-theory of the reduced crossed product $A \rtimes_{\alpha,r} P$ by any automorphic action of P. This formula is obtained as a consequence of a result on the K-theory of crossed products for special actions of G on totally disconnected spaces.

Operator Algebras Dynamical Systems K-Theory and Homology 46L05, 46L80 (Primary) 20Mxx, 11R04 (Secondary)

TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control

1 code implementation1 Jan 2021 Hongyu Zang, Xin Li, Li Zhang, Peiyao Zhao, Mingzhong Wang

Trust region methods and maximum entropy methods are two state-of-the-art branches used in reinforcement learning (RL) for the benefits of stability and exploration in continuous environments, respectively.

Continuous Control Reinforcement Learning (RL)

Muti-view Mouse Social Behaviour Recognition with Deep Graphical Model

1 code implementation4 Nov 2020 Zheheng Jiang, Feixiang Zhou, Aite Zhao, Xin Li, Ling Li, DaCheng Tao, Xuelong Li, Huiyu Zhou

To address this problem, we here propose a novel multiview latent-attention and dynamic discriminative model that jointly learns view-specific and view-shared sub-structures, where the former captures unique dynamics of each view whilst the latter encodes the interaction between the views.

Exploiting Semantic Relations for Fine-grained Entity Typing

1 code implementation AKBC 2020 Hongliang Dai, Yangqiu Song, Xin Li

We find that, in some cases, existing neural fine-grained entity typing models may ignore the semantic information in the context that is important for typing.

Entity Typing Relation +2

DR-GAN: Distribution Regularization for Text-to-Image Generation

1 code implementation17 Apr 2022 Hongchen Tan, Xiuping Liu, BaoCai Yin, Xin Li

This paper presents a new Text-to-Image generation model, named Distribution Regularization Generative Adversarial Network (DR-GAN), to generate images from text descriptions via improved distribution learning.

Generative Adversarial Network Text-to-Image Generation

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

1 code implementation31 Oct 2022 Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time dependent process, which is prevalent in practical applications.

Offline RL Reinforcement Learning (RL) +1

COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization

1 code implementation11 Mar 2024 Aozhong Zhang, Zi Yang, Naigang Wang, Yingyong Qin, Jack Xin, Xin Li, Penghang Yin

Within a fixed layer, COMQ treats all the scaling factor(s) and bit-codes as the variables of the reconstruction error.

Quantization

GANE: A Generative Adversarial Network Embedding

no code implementations18 May 2018 Huiting Hong, Xin Li, Mingzhong Wang

Network embedding has become a hot research topic recently which can provide low-dimensional feature representations for many machine learning applications.

Clustering Generative Adversarial Network +2

On Improving Deep Reinforcement Learning for POMDPs

no code implementations17 Apr 2018 Pengfei Zhu, Xin Li, Pascal Poupart, Guanghui Miao

Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go.

Atari Games Decision Making +4

Perceptually Optimized Generative Adversarial Network for Single Image Dehazing

no code implementations3 May 2018 Yixin Du, Xin Li

To overcome this weakness, we propose a direct deep learning approach toward image dehazing bypassing the step of transmission map estimation and facilitating end-to-end perceptual optimization.

Denoising Generative Adversarial Network +2

Weighted Low-Rank Approximation of Matrices and Background Modeling

no code implementations15 Apr 2018 Aritra Dutta, Xin Li, Peter Richtarik

We primarily study a special weighted low-rank approximation of matrices and then apply it to solve the background modeling problem.

ReHAR: Robust and Efficient Human Activity Recognition

no code implementations27 Feb 2018 Xin Li, Mooi Choo Chuah

The whole model is trained end-to-end to allow meaningful representations to be generated for the final activity recognition.

Human Activity Recognition Optical Flow Estimation

Joint Demosaicing and Denoising with Perceptual Optimization on a Generative Adversarial Network

no code implementations13 Feb 2018 Weisheng Dong, Ming Yuan, Xin Li, Guangming Shi

Image demosaicing - one of the most important early stages in digital camera pipelines - addresses the problem of reconstructing a full-resolution image from so-called color-filter-arrays.

Demosaicking Denoising +2

Learning with Rethinking: Recurrently Improving Convolutional Neural Networks through Feedback

no code implementations15 Aug 2017 Xin Li, Zequn Jie, Jiashi Feng, Changsong Liu, Shuicheng Yan

However, most of the existing CNN models only learn features through a feedforward structure and no feedback information from top to bottom layers is exploited to enable the networks to refine themselves.

Prune the Convolutional Neural Networks with Sparse Shrink

no code implementations8 Aug 2017 Xin Li, Changsong Liu

These results have demonstrated the effectiveness of our "Sparse Shrink" algorithm.

FoveaNet: Perspective-aware Urban Scene Parsing

no code implementations ICCV 2017 Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng

Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.

Scene Parsing

Weighted Low Rank Approximation for Background Estimation Problems

no code implementations4 Jul 2017 Aritra Dutta, Xin Li

Classical principal component analysis (PCA) is not robust to the presence of sparse outliers in the data.

A Batch-Incremental Video Background Estimation Model using Weighted Low-Rank Approximation of Matrices

no code implementations2 Jul 2017 Aritra Dutta, Xin Li, Peter Richtárik

Principal component pursuit (PCP) is a state-of-the-art approach for background estimation problems.

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA

no code implementations1 Dec 2016 Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, Huazhong Yang, William J. Dally

Evaluated on the LSTM for speech recognition benchmark, ESE is 43x and 3x faster than Core i7 5930k CPU and Pascal Titan X GPU implementations.

Quantization speech-recognition +1

Cross-scale predictive dictionaries

no code implementations16 Nov 2015 Vishwanath Saragadam, Xin Li, Aswin Sankaranarayanan

Sparse representations using data dictionaries provide an efficient model particularly for signals that do not enjoy alternate analytic sparsifying transformations.

Video Scene Parsing with Predictive Feature Learning

no code implementations ICCV 2017 Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan

In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.

Representation Learning Scene Parsing

Detecting Suicidal Ideation in Chinese Microblogs with Psychological Lexicons

no code implementations4 Nov 2014 Xiaolei Huang, Lei Zhang, Tianli Liu, David Chiu, Tingshao Zhu, Xin Li

Currently, we have identified 53 known suicidal cases who posted suicide notes on Weibo prior to their deaths. We explore linguistic features of these known cases using a psychological lexicon dictionary, and train an effective suicidal Weibo post detection model.

BIG-bench Machine Learning

Learning Hybrid Sparsity Prior for Image Restoration: Where Deep Learning Meets Sparse Coding

no code implementations18 Jul 2018 Fangfang Wu, Weisheng Dong, Guangming Shi, Xin Li

State-of-the-art approaches toward image restoration can be classified into model-based and learning-based.

Image Restoration

Superimposition-guided Facial Reconstruction from Skull

no code implementations28 Sep 2018 Celong Liu, Xin Li

We develop a new algorithm to perform facial reconstruction from a given skull.

Facial Inpainting

Learning Parametric Sparse Models for Image Super-Resolution

no code implementations NeurIPS 2016 Yongbo Li, Weisheng Dong, Xuemei Xie, Guangming Shi, Xin Li, Donglai Xu

More specifically, the parametric sparse prior of the desirable high-resolution (HR) image patches are learned from both the input low-resolution (LR) image and a training image dataset.

Image Super-Resolution

CONet: A Cognitive Ocean Network

no code implementations9 Jan 2019 Huimin Lu, Dong Wang, Yujie Li, Jianru Li, Xin Li, Hyoungseop Kim, Seiichi Serikawa, Iztok Humar

The Cognitive Ocean Network (CONet) will become the mainstream of future ocean science and engineering developments.

Adaptive Active Learning for Image Classification

no code implementations CVPR 2013 Xin Li, Yuhong Guo

Recently, active learning has attracted a lot of attention in the computer vision field, as it is time-consuming and costly to prepare a good set of labeled images for vision data analysis.

Active Learning Classification +4

Simplified Mirror-Based Camera Pose Computation via Rotation Averaging

no code implementations CVPR 2015 Gucan Long, Laurent Kneip, Xin Li, Xiaohu Zhang, Qifeng Yu

Our theoretical contribution extends the applicability of rotation averaging to a more general case, and enables mirror-based pose estimation in closed-form under the chordal L2-metric, or in an outlier-robust way by employing iterative L1-norm averaging.

Camera Calibration Pose Estimation

Object-Aware Dense Semantic Correspondence

no code implementations CVPR 2017 Fan Yang, Xin Li, Hong Cheng, Jianping Li, Leiting Chen

To address these problems, this paper proposes an object-aware method to estimate per-pixel correspondences from semantic to low-level by learning a classifier for each selected discriminative grid cell and guiding the localization of every pixel under the semantic constraint.

Object Semantic correspondence

Low-Rank Tensor Approximation With Laplacian Scale Mixture Modeling for Multiframe Image Denoising

no code implementations ICCV 2015 Weisheng Dong, Guangyu Li, Guangming Shi, Xin Li, Yi Ma

Patch-based low-rank models have proven effective in exploiting the spatial redundancy of natural images, especially for the application of image denoising.

Dictionary Learning Image Denoising

Semi-Supervised Zero-Shot Classification With Label Representation Learning

no code implementations ICCV 2015 Xin Li, Yuhong Guo, Dale Schuurmans

Most existing zero-shot learning methods require a user to first provide a set of semantic visual attributes for each class as side information before applying a two-step prediction procedure that introduces an intermediate attribute prediction problem.

Attribute Classification +4

Iris R-CNN: Accurate Iris Segmentation in Non-cooperative Environment

no code implementations25 Mar 2019 Chunyang Feng, Yufeng Sun, Xin Li

Despite the significant advances in iris segmentation, accomplishing accurate iris segmentation in non-cooperative environments remains a grand challenge.

Iris Segmentation Region Proposal +1

Target-Aware Deep Tracking

no code implementations CVPR 2019 Xin Li, Chao Ma, Baoyuan Wu, Zhenyu He, Ming-Hsuan Yang

Despite demonstrated successes for numerous vision tasks, the contributions of using pre-trained deep features for visual tracking are not as significant as that for object recognition.

Object Object Recognition +1

STN-Homography: estimate homography parameters directly

no code implementations6 Jun 2019 Qiang Zhou, Xin Li

In this paper, we introduce the STN-Homography model to directly estimate the homography matrix between an image pair.

Homography Estimation

Vispi: Automatic Visual Perception and Interpretation of Chest X-rays

no code implementations MIDL 2019 Xin Li, Rui Cao, Dongxiao Zhu

Medical imaging contains the essential information for rendering diagnostic and treatment decisions.

Image Captioning

Reconstructing Perceived Images from Brain Activity by Visually-guided Cognitive Representation and Adversarial Learning

no code implementations27 Jun 2019 Ziqi Ren, Jie Li, Xuetong Xue, Xin Li, Fan Yang, Zhicheng Jiao, Xinbo Gao

In addition, we introduce a novel three-stage learning approach which enables the (cognitive) encoder to gradually distill useful knowledge from the paired (visual) encoder during the learning process.

Generative Adversarial Network Image Reconstruction +2

Small and Practical BERT Models for Sequence Labeling

no code implementations IJCNLP 2019 Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer

We propose a practical scheme to train a single multilingual sequence labeling model that yields state of the art results and is small and fast enough to run on a single CPU.

Part-Of-Speech Tagging

Iterative Clustering with Game-Theoretic Matching for Robust Multi-consistency Correspondence

no code implementations3 Sep 2019 Chen Zhao, Jiaqi Yang, Ke Xian, Zhiguo Cao, Xin Li

Matching corresponding features between two images is a fundamental task in computer vision with numerous applications in object recognition, robotics, and 3D reconstruction.

3D Reconstruction Clustering +2

Spoofing and Anti-Spoofing with Wax Figure Faces

no code implementations12 Oct 2019 Shan Jia, Xin Li, Chuanbo Hu, Zhengquan Xu

In this work, we introduce a wax figure face database (WFFD) as a novel and super-realistic 3D face presentation attack.

Face Detection Face Recognition +1

Sparse estimation via $\ell_q$ optimization method in high-dimensional linear regression

no code implementations12 Nov 2019 Xin Li, Yaohua Hu, Chong Li, Xiaoqi Yang, Tianzi Jiang

In this paper, we discuss the statistical properties of the $\ell_q$ optimization methods $(0<q\leq 1)$, including the $\ell_q$ minimization method and the $\ell_q$ regularization method, for estimating a sparse parameter from noisy observations in high-dimensional linear regression with either a deterministic or random design.

regression Vocal Bursts Intensity Prediction

Relevance-Promoting Language Model for Short-Text Conversation

no code implementations26 Nov 2019 Xin Li, Piji Li, Wei Bi, Xiaojiang Liu, Wai Lam

In this paper, we propose to formulate the STC task as a language modeling problem and tailor-make a training strategy to adapt a language model for response generation.

Language Modelling Response Generation +1

Digital Twin: Acquiring High-Fidelity 3D Avatar from a Single Image

no code implementations7 Dec 2019 Ruizhe Wang, Chih-Fan Chen, Hao Peng, Xudong Liu, Oliver Liu, Xin Li

We present an approach to generate a high-fidelity 3D face avatar with a high-resolution UV texture map from a single image.

Face Model Vocal Bursts Intensity Prediction
