Search Results for author: Xinyang Jiang

Found 37 papers, 20 papers with code

LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation

no code implementations • 1 Apr 2024 • Zilong Wang, Xufang Luo, Xinyang Jiang, Dongsheng Li, Lili Qiu

This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment.

Knowledge Distillation

Understanding Training-free Diffusion Guidance: Mechanisms and Limitations

no code implementations • 19 Mar 2024 • Yifei Shen, Xinyang Jiang, Yezhen Wang, Yifan Yang, Dongqi Han, Dongsheng Li

Adding additional control to pretrained diffusion models has become an increasingly popular research area, with extensive applications in computer vision, reinforcement learning, and AI for science.

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

no code implementations • 21 Dec 2023 • Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.

Text to 3D

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

1 code implementation • 11 Dec 2023 • Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao

To address this limitation and prioritize harnessing structured knowledge, this paper advocates for leveraging LLMs to build a graph for each description to model the entities and attributes describing the category, as well as their correlations.

Ranked #1 on Prompt Engineering on ImageNet V2

Prompt Engineering

Unified Medical Image Pre-training in Language-Guided Common Semantic Space

no code implementations • 24 Nov 2023 • Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu

To overcome the aforementioned challenges, we propose a Unified Medical Image Pre-training framework, namely UniMedI, which uses diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images (especially for 2D and 3D images).

Online Video Quality Enhancement with Spatial-Temporal Look-up Tables

no code implementations • 22 Nov 2023 • Zefan Qu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Cairong Zhao

To the best of our knowledge, we are the first to exploit the LUT structure to extract temporal information in video tasks.

AccFlow: Backward Accumulation for Long-Range Optical Flow

1 code implementation • ICCV 2023 • Guangyang Wu, Xiaohong Liu, Kunming Luo, Xi Liu, Qingqing Zheng, Shuaicheng Liu, Xinyang Jiang, Guangtao Zhai, Wenyi Wang

To train and evaluate the proposed AccFlow, we have constructed a large-scale high-quality dataset named CVO, which provides ground-truth optical flow labels between adjacent and distant frames.

Optical Flow Estimation

Content-Adaptive Auto-Occlusion Network for Occluded Person Re-Identification

1 code implementation • IEEE Transactions on Image Processing 2023 • Cairong Zhao, Zefan Qu, Xinyang Jiang, Yuanpeng Tu, Xiang Bai

To address these challenges, we propose a novel Content-Adaptive Auto-Occlusion Network (CAAO) that dynamically selects the proper occlusion region of an image based on its content and the current training status.

Person Re-Identification

Dissecting Arbitrary-scale Super-resolution Capability from Pre-trained Diffusion Generative Models

no code implementations • 1 Jun 2023 • Ruibin Li, Qihua Zhou, Song Guo, Jie Zhang, Jingcai Guo, Xinyang Jiang, Yifei Shen, Zhenhua Han

Diffusion-based Generative Models (DGMs) have achieved unparalleled performance in synthesizing high-quality visual content, opening up the opportunity to improve image super-resolution (SR) tasks.

Image Super-Resolution

EA-HAS-Bench: Energy-Aware Hyperparameter and Architecture Search Benchmark

1 code implementation • The Eleventh International Conference on Learning Representations 2023 • Shuguang Dou, Xinyang Jiang, Cai Rong Zhao, Dongsheng Li

The energy consumption for training deep learning models is increasing at an alarming rate due to the growth of training data and model scale, resulting in a negative impact on carbon neutrality.

AutoML

SIMPLE: Specialized Model-Sample Matching for Domain Generalization

1 code implementation • International Conference on Learning Representations 2023 • Ziyue Li, Kan Ren, Xinyang Jiang, Yifei Shen, Haipeng Zhang, Dongsheng Li

Moreover, our method is highly efficient and achieves more than 1000 times training speedup compared to conventional DG methods that fine-tune a pretrained model.

Ranked #1 on Domain Generalization on PACS

Domain Generalization

Online Streaming Video Super-Resolution with Convolutional Look-Up Table

no code implementations • 1 Mar 2023 • Guanghao Yin, Zefan Qu, Xinyang Jiang, Shan Jiang, Zhenhua Han, Ningxin Zheng, Xiaohong Liu, Huan Yang, Yuqing Yang, Dongsheng Li, Lili Qiu

To facilitate the research on this problem, a new benchmark dataset named LDV-WebRTC is constructed based on a real-world online streaming system.

Video Super-Resolution

Unsupervised Video Anomaly Detection for Stereotypical Behaviours in Autism

no code implementations • 27 Feb 2023 • Jiaqi Gao, Xinyang Jiang, Yuqing Yang, Dongsheng Li, Lili Qiu

Correspondingly, we propose a Dual Stream deep model for Stereotypical Behaviours Detection, DS-SBD, based on the temporal trajectory of human poses and the repetition patterns of human actions.

Activity Recognition, Anomaly Detection +1

Towards Inference Efficient Deep Ensemble Learning

no code implementations • 29 Jan 2023 • Ziyue Li, Kan Ren, Yifan Yang, Xinyang Jiang, Yuqing Yang, Dongsheng Li

Ensemble methods can deliver surprising performance gains but also bring significantly higher computational costs, e.g., up to 2048X in large-scale ensemble tasks.

Ensemble Learning

Human Co-Parsing Guided Alignment for Occluded Person Re-identification

1 code implementation • IEEE Transactions on Image Processing 2022 • Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo

Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap; unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results.

Ranked #3 on Person Re-Identification on Occluded-DukeMTMC

Human Parsing, Person Re-Identification

Attentive Mask CLIP

1 code implementation • ICCV 2023 • Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.

Contrastive Learning, Retrieval +1

Learning Domain Invariant Prompt for Vision-Language Models

1 code implementation • 8 Dec 2022 • Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li, Duoqian Miao

Prompt learning is one of the most effective and trending ways to adapt powerful vision-language foundation models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples.

Ranked #4 on Prompt Engineering on Caltech-101

Domain Generalization, Language Modelling +2

Invisible Backdoor Attack with Dynamic Triggers against Person Re-identification

1 code implementation • 20 Nov 2022 • Wenli Sun, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, Cairong Zhao

Instead of learning fixed triggers for the target classes from the training set, DT-IBA can dynamically generate new triggers for any unknown identities.

Backdoor Attack, Image Steganography +2

Online Video Super-Resolution with Convolutional Kernel Bypass Graft

no code implementations • 4 Aug 2022 • Jun Xiao, Xinyang Jiang, Ningxin Zheng, Huan Yang, Yifan Yang, Yuqing Yang, Dongsheng Li, Kin-Man Lam

Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models.

Transfer Learning, Video Super-Resolution

Towards Privacy-Preserving Person Re-identification via Person Identify Shift

no code implementations • 15 Jul 2022 • Shuguang Dou, Xinyang Jiang, Qingsong Zhao, Dongsheng Li, Cairong Zhao

In this paper, we aim to develop a technique that can achieve a good trade-off between privacy protection and data usability for person ReID.

De-identification, Person Re-Identification +1

RendNet: Unified 2D/3D Recognizer With Latent Space Rendering

no code implementations • CVPR 2022 • Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li

Instead of relying on a single format, a good solution is to use the vector graphics (VG) and raster graphics (RG) formats together to avoid these shortcomings.

3D Object Recognition, Vector Graphics

Privacy-preserving Online AutoML for Domain-Specific Face Detection

no code implementations • CVPR 2022 • Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang

Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.

AutoML, Face Detection +1

Domain Generalization using Pretrained Models without Fine-tuning

no code implementations • 9 Mar 2022 • Ziyue Li, Kan Ren, Xinyang Jiang, Bo Li, Haipeng Zhang, Dongsheng Li

Fine-tuning pretrained models is a common practice in domain generalization (DG) tasks.

Ranked #8 on Domain Generalization on TerraIncognita

Domain Generalization, Ensemble Learning

Recognizing Vector Graphics without Rasterization

2 code implementations • NeurIPS 2021 • Xinyang Jiang, Lu Liu, Caihua Shan, Yifei Shen, Xuanyi Dong, Dongsheng Li

In this paper, we consider a different data format for images: vector graphics.

Object Detection +2

SANE: Specialization-Aware Neural Network Ensemble

no code implementations • 29 Sep 2021 • Ziyue Li, Kan Ren, Xinyang Jiang, Mingzhe Han, Haipeng Zhang, Dongsheng Li

Real-world data is often generated by some complex distribution, which can be approximated by a composition of multiple simpler distributions.

Ensemble Learning

Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision

no code implementations • 30 Aug 2021 • Bo Li, Xinyang Jiang, Donglin Bai, Yuge Zhang, Ningxin Zheng, Xuanyi Dong, Lu Liu, Yuqing Yang, Dongsheng Li

The energy consumption of deep learning models is increasing at a breathtaking rate, which raises concerns due to potential negative effects on carbon neutrality in the context of global warming and climate change.

Model Compression

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation • CVPR 2021 • Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

Ranked #4 on Person Re-Identification on PRCC

3D Reconstruction, Person Re-Identification

Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query

1 code implementation • ICCV 2021 • Guanyu Cai, Jun Zhang, Xinyang Jiang, Yifei Gong, Lianghua He, Fufu Yu, Pai Peng, Xiaowei Guo, Feiyue Huang, Xing Sun

However, the performance of existing methods suffers in real life since the user is likely to provide an incomplete description of an image, which often leads to results filled with false positives that fit the incomplete description.

Cross-Modal Retrieval, Image Retrieval +1

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search

2 code implementations • 8 Jan 2021 • Chenyang Gao, Guanyu Cai, Xinyang Jiang, Feng Zheng, Jun Zhang, Yifei Gong, Pai Peng, Xiaowei Guo, Xing Sun

Secondly, a BERT with locality-constrained attention is proposed to obtain representations of descriptions at different scales.

Ranked #15 on Text based Person Retrieval on CUHK-PEDES

Descriptive Sentence +2

Learning To Know Where To See: A Visibility-Aware Approach for Occluded Person Re-Identification

no code implementations • ICCV 2021 • Jinrui Yang, Jiawei Zhang, Fufu Yu, Xinyang Jiang, Mengdan Zhang, Xing Sun, Ying-Cong Chen, Wei-Shi Zheng

Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles to alleviate the occlusion problem.

Person Re-Identification

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.

Person Re-Identification

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

3 code implementations • 12 Sep 2020 • Jinpeng Wang, Yuting Gao, Ke Li, Jianguo Hu, Xinyang Jiang, Xiaowei Guo, Rongrong Ji, Xing Sun

Specifically, we construct a positive clip and a negative clip for each video.

Action Recognition, Representation Learning

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation • 11 Sep 2020 • Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Ranked #1 on Person Re-Identification on CUHK03-C

Person Re-Identification

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians

1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun

In the conventional person Re-ID setting, it is widely assumed that cropped person images are for each individual.

Person Re-Identification, Retrieval

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation • 3 Dec 2019 • Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations • 28 Nov 2019 • Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, the research interest of person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

1 code implementation • CVPR 2019 • Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji

Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.

Ranked #2 on Person Re-Identification on CUHK03-C

Person Re-Identification
