Search Results for author: Li Xu

Found 49 papers, 10 papers with code

Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers

no code implementations 11 Jul 2024 Zhengbo Zhang, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu

We introduce Diff-Tracker, a novel approach for the challenging unsupervised visual tracking task leveraging the pre-trained text-to-image diffusion model.

Visual Tracking

Active Learning Enabled Low-cost Cell Image Segmentation Using Bounding Box Annotation

no code implementations 2 May 2024 Yu Zhu, Qiang Yang, Li Xu

Cell image segmentation is usually implemented using fully supervised deep learning methods, which heavily rely on extensive annotated training data.

Active Learning Cell Segmentation +3

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

no code implementations CVPR 2024 Li Xu, Haoxuan Qu, Yujun Cai, Jun Liu

Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds.

6D Pose Estimation using RGB Denoising +1

Trustworthy Large Models in Vision: A Survey

no code implementations 16 Nov 2023 Ziyan Guo, Li Xu, Jun Liu

The rapid progress of Large Models (LMs) has recently revolutionized various fields of deep learning with remarkable grades, ranging from Natural Language Processing (NLP) to Computer Vision (CV).

Survey

Deep Neural Network Identification of Limnonectes Species and New Class Detection Using Image Data

no code implementations 15 Nov 2023 Li Xu, Yili Hong, Eric P. Smith, David S. McLeod, Xinwei Deng, Laura J. Freeman

We demonstrate that deep neural networks can successfully automate the classification of an image into a known species group for which it has been trained.

Network Identification Out of Distribution (OOD) Detection

Bridged-GNN: Knowledge Bridge Learning for Effective Knowledge Transfer

no code implementations 18 Aug 2023 Wendong Bi, Xueqi Cheng, Bingbing Xu, Xiaoqian Sun, Li Xu, HuaWei Shen

Transfer learning has been a feasible way to transfer knowledge from high-quality external data of source domains to limited data of target domains, which follows a domain-level knowledge transfer to learn a shared posterior distribution.

GRAPH DOMAIN ADAPTATION Retrieval +1

Dual Inverse Degradation Network for Real-World SDRTV-to-HDRTV Conversion

no code implementations 7 Jul 2023 Kepeng Xu, Li Xu, Gang He, Xianyun Wu, Zhiqiang Zhang, Wenxin Yu, Yunsong Li

In this study, we address the emerging necessity of converting Standard Dynamic Range Television (SDRTV) content into High Dynamic Range Television (HDRTV) in light of the limited amount of native HDRTV content.

Tone Mapping Video Restoration

Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark

1 code implementation 10 Jun 2023 Li Xu, Bo Liu, Ameer Hamza Khan, Lu Fan, Xiao-Ming Wu

With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datasets such as MSCOCO, vision-language pre-training (VLP) has become an active area of research and proven to be effective for various VL tasks such as visual-question answering.

Image-text Retrieval Medical Report Generation +3

Meta Compositional Referring Expression Segmentation

no code implementations CVPR 2023 Li Xu, Mark He Huang, Xindi Shang, Zehuan Yuan, Ying Sun, Jun Liu

Then, following a novel meta optimization scheme to optimize the model to obtain good testing performance on the virtual testing sets after training on the virtual training set, our framework can effectively drive the model to better capture semantics and visual representations of individual concepts, and thus obtain robust generalization performance even when handling novel compositions.

Meta-Learning Referring Expression +2

Predicting the Silent Majority on Graphs: Knowledge Transferable Graph Neural Network

1 code implementation 2 Feb 2023 Wendong Bi, Bingbing Xu, Xiaoqian Sun, Li Xu, HuaWei Shen, Xueqi Cheng

To combat the above challenges, we propose Knowledge Transferable Graph Neural Network (KT-GNN), which models distribution shifts during message passing and representation learning by transferring knowledge from vocal nodes to silent nodes.

Graph Neural Network Representation Learning

SDRTV-to-HDRTV Conversion via Spatial-Temporal Feature Fusion

no code implementations 4 Nov 2022 Kepeng Xu, Li Xu, Gang He, Chang Wu, Zijia Ma, Ming Sun, Yu-Wing Tai

To evaluate the performance of the proposed method, we construct a corresponding multi-frame dataset using HDR video of the HDR10 standard to conduct a comprehensive evaluation of different methods.

Heatmap Distribution Matching for Human Pose Estimation

no code implementations 3 Oct 2022 Haoxuan Qu, Li Xu, Yujun Cai, Lin Geng Foo, Jun Liu

In this paper, we show that when the heatmap prediction is optimized in such a way, the model performance of body joint localization, which is the intrinsic objective of this task, may not be consistently improved during the optimization process.

2D Human Pose Estimation Pose Estimation

Global Priors Guided Modulation Network for Joint Super-Resolution and Inverse Tone-Mapping

no code implementations 14 Aug 2022 Gang He, Shaoyi Long, Li Xu, Chang Wu, Jinjia Zhou, Ming Sun, Xing Wen, Yurong Dai

Joint super-resolution and inverse tone-mapping (SR-ITM) aims to enhance the visual quality of videos that have quality deficiencies in resolution and dynamic range.

4k Super-Resolution +1

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

no code implementations 23 Jul 2022 Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu

Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video.

Graph Generation Meta-Learning +2

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations 25 May 2022 Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Transcoded Video Restoration by Temporal Spatial Auxiliary Network

1 code implementation 15 Dec 2021 Li Xu, Gang He, Jinjia Zhou, Jie Lei, Weiying Xie, Yunsong Li, Yu-Wing Tai

On most video platforms, such as YouTube and TikTok, the played videos have usually undergone multiple video encodings, such as hardware encoding by recording devices, software encoding by video editing apps, and single/multiple video transcoding by video application servers.

Video Editing Video Restoration

Statistical Perspectives on Reliability of Artificial Intelligence Systems

no code implementations 9 Nov 2021 Yili Hong, Jiayi Lian, Li Xu, Jie Min, Yueyao Wang, Laura J. Freeman, Xinwei Deng

We also describe recent developments in modeling and analysis of AI reliability and outline statistical research challenges in this area, including out-of-distribution detection, the effect of the training set, adversarial attacks, model accuracy, and uncertainty quantification, and discuss how those topics can be related to AI reliability, with illustrative examples.

Out-of-Distribution Detection Uncertainty Quantification

Recent Advances of Continual Learning in Computer Vision: An Overview

no code implementations 23 Sep 2021 Haoxuan Qu, Hossein Rahmani, Li Xu, Bryan Williams, Jun Liu

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order.

Continual Learning Knowledge Distillation

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

3 code implementations CVPR 2021 Li Xu, He Huang, Jun Liu

In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios.

Autonomous Vehicles Benchmarking +4

Unifying deterministic and stochastic ecological dynamics via a landscape-flux approach

no code implementations 15 Mar 2021 Li Xu, Denis Patterson, Ann Carla Staver, Simon Asher Levin, Jin Wang

We develop a landscape-flux framework to investigate observed frequency distributions of vegetation and the stability of these ecological systems under fluctuations.

Modelling Universal Order Book Dynamics in Bitcoin Market

no code implementations 15 Jan 2021 Fabin Shi, Nathan Aden, Shengda Huang, Neil Johnson, Xiaoqian Sun, Jinhua Gao, Li Xu, HuaWei Shen, Xueqi Cheng, Chaoming Song

Understanding the emergence of universal features such as the stylized facts in markets is a long-standing challenge that has drawn much attention from economists and physicists.

Learning to Benchmark: Determining Best Achievable Misclassification Error from Training Data

2 code implementations 16 Sep 2019 Morteza Noshad, Li Xu, Alfred Hero

In this problem the objective is to establish statistically consistent estimates of the Bayes misclassification error rate without having to learn a Bayes-optimal classifier.

Effective Domain Knowledge Transfer with Soft Fine-tuning

no code implementations 5 Sep 2019 Zhichen Zhao, Bo-Wen Zhang, Yuning Jiang, Li Xu, Lei LI, Wei-Ying Ma

However, the datasets from source domain are simply discarded in the fine-tuning process.

Transfer Learning

Multi-Antenna Channel Interpolation via Tucker Decomposed Extreme Learning Machine

no code implementations 26 Dec 2018 Han Zhang, Bo Ai, Wenjun Xu, Li Xu, Shuguang Cui

Channel interpolation is an essential technique for providing high-accuracy estimation of the channel state information (CSI) for wireless systems design where the frequency-space structural correlations of multi-antenna channel are typically hidden in matrix or tensor forms.

Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning

no code implementations 26 Jul 2017 Yukai Shi, Keze Wang, Chongyu Chen, Li Xu, Liang Lin

Single image super resolution (SR), which refers to reconstructing a higher-resolution (HR) image from the observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential.

Computational Efficiency Image Restoration +2

Local- and Holistic- Structure Preserving Image Super Resolution via Deep Joint Component Learning

no code implementations 25 Jul 2016 Yukai Shi, Keze Wang, Li Xu, Liang Lin

Recently, machine learning based single image super resolution (SR) approaches focus on jointly learning representations for high-resolution (HR) and low-resolution (LR) image patch pairs to improve the quality of the super-resolved images.

Image Super-Resolution Representation Learning

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification

no code implementations 13 Feb 2016 Jimmy Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan

This task not only requires collective perception over both visual and auditory signals; robustness to severe quality degradations and unconstrained content variations is also indispensable.

Speaker Identification

Mutual-Structure for Joint Filtering

no code implementations ICCV 2015 Xiaoyong Shen, Chao Zhou, Li Xu, Jiaya Jia

Previous joint/guided filters directly transfer the structural information in the reference image to the target one.

Depth Completion Image Enhancement +3

Shepard Convolutional Neural Networks

1 code implementation NeurIPS 2015 Jimmy SJ. Ren, Li Xu, Qiong Yan, Wenxiu Sun

In this paper, we draw on Shepard interpolation and design Shepard Convolutional Neural Networks (ShCNN) which efficiently realizes end-to-end trainable TVI operators in the network.

Deep Learning Image Inpainting +2

Deep Multimodal Speaker Naming

no code implementations 17 Jul 2015 Yongtao Hu, Jimmy Ren, Jingwen Dai, Chang Yuan, Li Xu, Wenping Wang

Automatic speaker naming is the problem of localizing as well as identifying each speaking character in a TV/movie/live show video.

Face Alignment

Just Noticeable Defocus Blur Detection and Estimation

no code implementations CVPR 2015 Jianping Shi, Li Xu, Jiaya Jia

We tackle a fundamental problem to detect and estimate just noticeable blur (JNB) caused by defocus that spans a small number of pixels in images.

Defocus Blur Detection

On Vectorization of Deep Convolutional Neural Networks for Vision Tasks

no code implementations 29 Jan 2015 Jimmy SJ. Ren, Li Xu

We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN).

An Evasion and Counter-Evasion Study in Malicious Websites Detection

no code implementations 8 Aug 2014 Li Xu, Zhenxin Zhan, Shouhuai Xu, Keyin Ye

Within this framework, we show that an adaptive attacker can make malicious websites evade powerful detection models, but proactive training can be an effective counter-evasion defense mechanism.

100+ Times Faster Weighted Median Filter (WMF)

no code implementations CVPR 2014 Qi Zhang, Li Xu, Jiaya Jia

Weighted median, in the form of either solver or filter, has been employed in a wide range of computer vision solutions for its beneficial properties in sparsity representation.

2D Semantic Segmentation task 1 (8 classes) Optical Flow Estimation +2

Discriminative Blur Detection Features

no code implementations CVPR 2014 Jianping Shi, Li Xu, Jiaya Jia

Ubiquitous image blur brings out a practically important question – what are effective features to differentiate between blurred and unblurred image regions.

Deblurring

Joint Depth Estimation and Camera Shake Removal from Single Blurry Image

no code implementations CVPR 2014 Zhe Hu, Li Xu, Ming-Hsuan Yang

The non-uniform blur effect is caused not only by the camera motion but also by the depth variation of the scene.

Deblurring Depth Estimation +1

Dense Scattering Layer Removal

no code implementations 13 Oct 2013 Qiong Yan, Li Xu, Jiaya Jia

We propose a new model, together with advanced optimization, to separate a thick scattering media layer from a single natural image.

Hierarchical Saliency Detection

no code implementations CVPR 2013 Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

When dealing with objects with complex structures, saliency detection confronts a critical problem, namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns.

Saliency Detection

Unnatural L0 Sparse Representation for Natural Image Deblurring

no code implementations CVPR 2013 Li Xu, Shicheng Zheng, Jiaya Jia

We show in this paper that the success of previous maximum a posteriori (MAP) based blur removal methods partly stems from their respective intermediate steps, which implicitly or explicitly create an unnatural representation containing salient image structures.

Ranked #14 on Deblurring on RealBlur-R (trained on GoPro) (SSIM (sRGB) metric)

Deblurring Image Deblurring
