Search Results for author: Xinyang Jiang

Found 38 papers, 20 papers with code

LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation

no code implementations 1 Apr 2024 Zilong Wang, Xufang Luo, Xinyang Jiang, Dongsheng Li, Lili Qiu

This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment.

Knowledge Distillation

Understanding and Improving Training-free Loss-based Diffusion Guidance

no code implementations 19 Mar 2024 Yifei Shen, Xinyang Jiang, Yezhen Wang, Yifan Yang, Dongqi Han, Dongsheng Li

Adding additional control to pretrained diffusion models has become an increasingly popular research area, with extensive applications in computer vision, reinforcement learning, and AI for science.

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

no code implementations 21 Dec 2023 Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.

Text to 3D

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

1 code implementation 11 Dec 2023 Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao

To address this limitation and prioritize harnessing structured knowledge, this paper advocates for leveraging LLMs to build a graph for each description to model the entities and attributes describing the category, as well as their correlations.

Prompt Engineering

Unified Medical Image Pre-training in Language-Guided Common Semantic Space

no code implementations 24 Nov 2023 Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu

To overcome the aforementioned challenges, we propose a Unified Medical Image Pre-training framework, namely UniMedI, which uses diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images (especially 2D and 3D images).

Online Video Quality Enhancement with Spatial-Temporal Look-up Tables

no code implementations 22 Nov 2023 Zefan Qu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Cairong Zhao

To the best of our knowledge, we are the first to exploit the LUT structure to extract temporal information in video tasks.

AccFlow: Backward Accumulation for Long-Range Optical Flow

1 code implementation ICCV 2023 Guangyang Wu, Xiaohong Liu, Kunming Luo, Xi Liu, Qingqing Zheng, Shuaicheng Liu, Xinyang Jiang, Guangtao Zhai, Wenyi Wang

To train and evaluate the proposed AccFlow, we have constructed a large-scale high-quality dataset named CVO, which provides ground-truth optical flow labels between adjacent and distant frames.

Optical Flow Estimation

Content-Adaptive Auto-Occlusion Network for Occluded Person Re-Identification

1 code implementation IEEE Transactions on Image Processing 2023 Cairong Zhao, Zefan Qu, Xinyang Jiang, Yuanpeng Tu, Xiang Bai

To address these challenges, we propose a novel Content-Adaptive Auto-Occlusion Network (CAAO), that is able to dynamically select the proper occlusion region of an image based on its content and the current training status.

Person Re-Identification

Dissecting Arbitrary-scale Super-resolution Capability from Pre-trained Diffusion Generative Models

no code implementations 1 Jun 2023 Ruibin Li, Qihua Zhou, Song Guo, Jie Zhang, Jingcai Guo, Xinyang Jiang, Yifei Shen, Zhenhua Han

Diffusion-based Generative Models (DGMs) have achieved unparalleled performance in synthesizing high-quality visual content, opening up the opportunity to improve image super-resolution (SR) tasks.

Image Super-Resolution

SIMPLE: Specialized Model-Sample Matching for Domain Generalization

1 code implementation International Conference on Learning Representations 2023 Ziyue Li, Kan Ren, Xinyang Jiang, Yifei Shen, Haipeng Zhang, Dongsheng Li

Moreover, our method is highly efficient and achieves more than 1000 times training speedup compared to conventional DG methods that fine-tune a pretrained model.

Domain Generalization

EA-HAS-Bench: Energy-Aware Hyperparameter and Architecture Search Benchmark

1 code implementation The Eleventh International Conference on Learning Representations 2023 Shuguang Dou, Xinyang Jiang, Cai Rong Zhao, Dongsheng Li

The energy consumption for training deep learning models is increasing at an alarming rate due to the growth of training data and model scale, resulting in a negative impact on carbon neutrality.

AutoML

Online Streaming Video Super-Resolution with Convolutional Look-Up Table

no code implementations 1 Mar 2023 Guanghao Yin, Zefan Qu, Xinyang Jiang, Shan Jiang, Zhenhua Han, Ningxin Zheng, Xiaohong Liu, Huan Yang, Yuqing Yang, Dongsheng Li, Lili Qiu

To facilitate the research on this problem, a new benchmark dataset named LDV-WebRTC is constructed based on a real-world online streaming system.

Video Super-Resolution

Unsupervised Video Anomaly Detection for Stereotypical Behaviours in Autism

no code implementations 27 Feb 2023 Jiaqi Gao, Xinyang Jiang, Yuqing Yang, Dongsheng Li, Lili Qiu

Correspondingly, we propose a Dual Stream deep model for Stereotypical Behaviours Detection, DS-SBD, based on the temporal trajectory of human poses and the repetition patterns of human actions.

Activity Recognition, Anomaly Detection, +1

Towards Inference Efficient Deep Ensemble Learning

no code implementations 29 Jan 2023 Ziyue Li, Kan Ren, Yifan Yang, Xinyang Jiang, Yuqing Yang, Dongsheng Li

Ensemble methods can deliver surprising performance gains but also bring significantly higher computational costs, e.g., up to 2048X in large-scale ensemble tasks.

Ensemble Learning

Human Co-Parsing Guided Alignment for Occluded Person Re-identification

1 code implementation IEEE Transactions on Image Processing 2022 Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo

Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap; unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results.

Human Parsing, Person Re-Identification

Attentive Mask CLIP

1 code implementation ICCV 2023 Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.

Contrastive Learning, Retrieval, +1

Learning Domain Invariant Prompt for Vision-Language Models

1 code implementation 8 Dec 2022 Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li, Duoqian Miao

Prompt learning is one of the most effective and trending ways to adapt powerful vision-language foundation models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples.

Domain Generalization, Language Modelling, +2

Invisible Backdoor Attack with Dynamic Triggers against Person Re-identification

1 code implementation 20 Nov 2022 Wenli Sun, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, Cairong Zhao

Instead of learning fixed triggers for the target classes from the training set, DT-IBA can dynamically generate new triggers for any unknown identities.

Backdoor Attack, Image Steganography, +2

Online Video Super-Resolution with Convolutional Kernel Bypass Graft

no code implementations 4 Aug 2022 Jun Xiao, Xinyang Jiang, Ningxin Zheng, Huan Yang, Yifan Yang, Yuqing Yang, Dongsheng Li, Kin-Man Lam

Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models.

Transfer Learning, Video Super-Resolution

Towards Privacy-Preserving Person Re-identification via Person Identify Shift

no code implementations 15 Jul 2022 Shuguang Dou, Xinyang Jiang, Qingsong Zhao, Dongsheng Li, Cairong Zhao

In this paper, we aim to develop a technique that can achieve a good trade-off between privacy protection and data usability for person ReID.

De-identification, Person Re-Identification, +1

RendNet: Unified 2D/3D Recognizer With Latent Space Rendering

no code implementations CVPR 2022 Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li

Instead of looking at one format, it is a good solution to utilize the vector graphics (VG) and raster graphics (RG) formats together to avoid these shortcomings.

3D Object Recognition, Vector Graphics

Privacy-preserving Online AutoML for Domain-Specific Face Detection

no code implementations CVPR 2022 Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang

Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.

AutoML, Face Detection, +1

SANE: Specialization-Aware Neural Network Ensemble

no code implementations 29 Sep 2021 Ziyue Li, Kan Ren, Xinyang Jiang, Mingzhe Han, Haipeng Zhang, Dongsheng Li

Real-world data is often generated by some complex distribution, which can be approximated by a composition of multiple simpler distributions.

Ensemble Learning

Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision

no code implementations 30 Aug 2021 Bo Li, Xinyang Jiang, Donglin Bai, Yuge Zhang, Ningxin Zheng, Xuanyi Dong, Lu Liu, Yuqing Yang, Dongsheng Li

The energy consumption of deep learning models is increasing at a breathtaking rate, which raises concerns due to potential negative effects on carbon neutrality in the context of global warming and climate change.

Model Compression

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation CVPR 2021 Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

3D Reconstruction, Person Re-Identification

Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query

1 code implementation ICCV 2021 Guanyu Cai, Jun Zhang, Xinyang Jiang, Yifei Gong, Lianghua He, Fufu Yu, Pai Peng, Xiaowei Guo, Feiyue Huang, Xing Sun

However, the performance of existing methods suffers in real life since the user is likely to provide an incomplete description of an image, which often leads to results filled with false positives that fit the incomplete description.

Cross-Modal Retrieval, Image Retrieval, +1

Learning To Know Where To See: A Visibility-Aware Approach for Occluded Person Re-Identification

no code implementations ICCV 2021 Jinrui Yang, Jiawei Zhang, Fufu Yu, Xinyang Jiang, Mengdan Zhang, Xing Sun, Ying-Cong Chen, Wei-Shi Zheng

Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles to alleviate the occlusion problem.

Person Re-Identification

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation 10 Dec 2020 Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.

Person Re-Identification

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation 11 Sep 2020 Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Person Re-Identification

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation 3 Dec 2019 Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations 28 Nov 2019 Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

1 code implementation CVPR 2019 Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji

Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.

Person Re-Identification
