no code implementations • 17 Dec 2024 • Xinlong Cheng, Tiantian Cao, Guoan Cheng, BangXuan Huang, Xinghan Tian, Ye Wang, Xiaoyu He, Weixin Li, Tianfan Xue, Xuan Dong
In this work, we address the limitations of denoising diffusion models (DDMs) in image restoration tasks, particularly the shape and color distortions that can compromise image quality.
no code implementations • 12 Dec 2024 • Chenyu Yang, Xuan Dong, Xizhou Zhu, Weijie Su, Jiahao Wang, Hao Tian, Zhe Chen, Wenhai Wang, Lewei Lu, Jifeng Dai
To this end, we extend each image into a "static" video and introduce a unified token compression strategy called Progressive Visual Token Compression (PVC), where the tokens of each frame are progressively encoded and adaptively compressed to supplement the information not extracted from previous frames.
1 code implementation • 11 Jun 2024 • Chenyu Yang, Xizhou Zhu, Jinguo Zhu, Weijie Su, Junjie Wang, Xuan Dong, Wenhai Wang, Lewei Lu, Bin Li, Jie Zhou, Yu Qiao, Jifeng Dai
Recently, vision model pre-training has evolved from relying on manually annotated datasets to leveraging large-scale, web-crawled image-text data.
no code implementations • 16 Apr 2024 • Chunli Peng, Xuan Dong, Tiantian Cao, Zhengqing Li, Kun Dong, Weixin Li
The fusion of images from dual-camera systems featuring a wide-angle and a telephoto camera has recently become a popular research problem.
no code implementations • 18 Dec 2023 • Tiantian Cao, Xuan Dong, Chunli Peng, Zhengqing Li, Xinyu Guo, Weixin Li
Our insight is to minimize the occlusion area and thus maximize the use of pixels from $\mathbf{T}$ images.
1 code implementation • journal 2023 • Kang Xu, Weixin Li, Xia Wang, Xiaoyan Hu, Ke Yan, Xiaojie Wang, Xuan Dong
Based on the prior that, for each pixel, its similar pixels are usually spatially close, our insights are that (1) we partition the image into non-overlapped windows and perform regional self-attention to reduce the search range of each pixel, and (2) we encourage pixels across different windows to communicate with each other.
no code implementations • AAAI Technical Track on Computer Vision I 2021 • Xuan Dong, Xiaoyan Hu, Weixin Li, Xiaojie Wang, Yunhong Wang
In most related HDR imaging methods, the problem is usually solved by Multiple Images Merging, i.e., the final HDR image is fused from pixels of all the input LDR images.
no code implementations • 31 Jul 2020 • Xuan Dong, Donald S. Williamson
The real-world capabilities of objective speech quality measures are limited, since current measures (1) are developed from simulated data that does not adequately model real environments, or (2) predict objective scores that are not always strongly correlated with subjective ratings.
1 code implementation • AAAI Technical Track: Vision 2020 • Xuan Dong, Weixin Li, Xiaojie Wang, Yunhong Wang
We present a new CNN model, named cycle CNN, which can directly use the real data from monochrome-color camera systems for training.
2 code implementations • Proceedings of the AAAI Conference on Artificial Intelligence 2019 • Xuan Dong, Weixin Li, Xiaojie Wang, Yunhong Wang
To get high-quality color images, it is desirable to colorize the gray image using the color image as a reference.
no code implementations • NeurIPS 2018 • Chenfei Wu, Jinlai Liu, Xiaojie Wang, Xuan Dong
A chain of reasoning (CoR) is constructed for supporting multi-step and dynamic reasoning on changed relations and objects.
no code implementations • 21 Nov 2015 • Xuan Dong, Boyan Bonev, Weixin Li, Weichao Qiu, Xianjie Chen, Alan Yuille
Base-detail separation is a fundamental computer vision problem consisting of modeling a smooth base layer with the coarse structures, and a detail layer containing the texture-like structures.
no code implementations • 21 Nov 2015 • Xuan Dong, Yu Zhu, Weixin Li, Lingxi Xie, Alex Wong, Alan Yuille
In this paper, we propose to use both fidelity (the difference from the original images) and naturalness (human visual perception of super-resolved images) for evaluation.
no code implementations • CVPR 2015 • Xuan Dong, Boyan Bonev, Yu Zhu, Alan L. Yuille
We study the problem of temporally consistent video post-processing.