Search Results for author: Qingsong Xie

Found 15 papers, 8 papers with code

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

no code implementations • 22 May 2025 • Shuhao Han, Haotian Fan, Fangyuan Kong, Wenjie Liao, Chunle Guo, Chongyi Li, Radu Timofte, Liang Li, Tao Li, Junhui Cui, Yunqiu Wang, Yang Tai, Jingwei Sun, Jianhui Sun, Xinli Yue, Tianyi Wang, Huan Hou, Junda Lu, Xinyang Huang, Zitang Zhou, Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao, Trong-Hieu Nguyen-Mau, Minh-Hoang Le, Minh-Khoa Le-Phan, Duy-Nam Ly, Hai-Dang Nguyen, Minh-Triet Tran, Yukang Lin, Yan Hong, Chuanbiao Song, Siyuan Li, Jun Lan, Zhichao Zhang, Xinyue Li, Wei Sun, ZiCheng Zhang, Yunhao Li, Xiaohong Liu, Guangtao Zhai, Zitong Xu, Huiyu Duan, Jiarui Wang, Guangji Ma, Liu Yang, Lu Liu, Qiang Hu, Xiongkuo Min, Zichuan Wang, Zhenchen Tang, Bo Peng, Jing Dong, Fengbin Guan, Zihao Yu, Yiting Lu, Wei Luo, Xin Li, Minhao Lin, Haofeng Chen, Xuanxuan He, Kele Xu, Qisheng Xu, Zijian Gao, Tianjiao Wan, Bo-Cheng Qiu, Chih-Chung Hsu, Chia-Ming Lee, Yu-Fan Lin, Bo Yu, Zehao Wang, Da Mu, Mingxiu Chen, Junkang Fang, Huamei Sun, Wending Zhao, Zhiyu Wang, Wang Liu, Weikang Yu, Puhong Duan, Bin Sun, Xudong Kang, Shutao Li, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Jiarong He, Zhishan Qiao, Yongqing Huang, Zewen Chen, Zhe Pang, Juan Wang, Jian Guo, Zhizhuo Shao, Ziyu Feng, Bing Li, Weiming Hu, Hesong Li, Dehua Liu, Zeming Liu, Qingsong Xie, Ruichen Wang, Zhihao LI, Yuqi Liang, Jianqi Bi, Jun Luo, Junfeng Yang, Can Li, Jing Fu, Hongwei Xu, Mingrui Long, Lulin Tang

A total of 211 participants have registered in the structure track.

Image Restoration, Text to Image Generation, +1

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

1 code implementation • 1 Apr 2025 • Zhenyi Liao, Qingsong Xie, Yanhao Zhang, Zijian Kong, Haonan Lu, Zhenyu Yang, Zhijie Deng

Increasing attention has been placed on improving the reasoning capacities of multi-modal large language models (MLLMs).

Spatial Reasoning

H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding

no code implementations • 31 Mar 2025 • Qi Wu, Quanlong Zheng, Yanhao Zhang, Junlin Xie, Jinguo Luo, Kuo Wang, Peng Liu, Qingsong Xie, Ru Zhen, Haonan Lu, Zhenyu Yang

To tackle this challenge, we propose a hierarchical and holistic video understanding (H2VU) benchmark designed to evaluate both general video and online streaming video comprehension.

Video Understanding

Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens

no code implementations • 11 Mar 2025 • Qingsong Xie, Zhao Zhang, Zhe Huang, Yanhao Zhang, Haonan Lu, Zhenyu Yang

Experiments demonstrate Layton's superiority in high-fidelity reconstruction, with a reconstruction Fréchet Inception Distance of 10.8 on the MSCOCO-2017 5K benchmark for 1024x1024 image reconstruction.

Decoder, Image Reconstruction, +3

FaceScore: Benchmarking and Enhancing Face Quality in Human Generation

1 code implementation • 24 Jun 2024 • Zhenyi Liao, Qingsong Xie, Chen Chen, Hannan Lu, Zhijie Deng

To address this issue, we first assess the face quality of generations from popular pre-trained DMs with the aid of human annotators and then evaluate the alignment of existing metrics with human judgments.

Benchmarking, Denoising, +3

TLCM: Training-efficient Latent Consistency Model for Image Generation with 2-8 Steps

1 code implementation • 9 Jun 2024 • Qingsong Xie, Zhenyi Liao, Zhijie Deng, Chen Chen, Haonan Lu

Distilling latent diffusion models (LDMs) into ones that are fast to sample from is attracting growing research interest.

Image Generation, Style Transfer

LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

no code implementations • 17 Apr 2024 • Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, Haonan Lu

To this end, we propose layer pruning and normalized distillation for compressing diffusion models (LAPTOP-Diff).

Knowledge Distillation

SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

no code implementations • 3 Mar 2024 • Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-Jun Zha, Haonan Lu

In contrast to vanilla consistency distillation (CD), which distills the ordinary differential equation (ODE) solver-based sampling process of a pretrained teacher model into a student, SCott explores the possibility and validates the efficacy of integrating stochastic differential equation (SDE) solvers into CD to fully unleash the potential of the teacher.

Text to Image Generation, Text-to-Image Generation

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

1 code implementation • CVPR 2024 • Weizhen He, Yiheng Deng, Shixiang Tang, Qihao Chen, Qingsong Xie, Yizhou Wang, Lei Bai, Feng Zhu, Rui Zhao, Wanli Ouyang, Donglian Qi, Yunfeng Yan

This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions.

Person Re-Identification, Triplet

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

1 code implementation • CVPR 2023 • Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang

Specifically, we propose HumanBench, based on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.

Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)

Attribute, Autonomous Driving, +5
