no code implementations • 17 Jan 2025 • Huiyun Cao, Yuan Shi, Bin Xia, Xiaoyu Jin, Wenming Yang
Specifically, DiffStereo first learns latent high-frequency representations (LHFR) of HQ images.
1 code implementation • 7 Jan 2025 • Yuechen Zhang, Yaoyang Liu, Bin Xia, Bohao Peng, Zexin Yan, Eric Lo, Jiaya Jia
We present Magic Mirror, a framework for generating identity-preserved videos with cinematic-level quality and dynamic motion.
no code implementations • 22 Dec 2024 • Bin Xia, Yuechen Zhang, Jingyao Li, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia
We begin by analyzing existing frameworks and the requirements of downstream tasks, proposing a unified framework that integrates both T2I models and various editing tasks.
1 code implementation • 19 Aug 2024 • Zhengchao Huang, Bin Xia, Zicheng Lin, Zhun Mou, Wenming Yang, Jiaya Jia
The rapid advancement of deepfake technologies has sparked widespread public concern, particularly as face forgery poses a serious threat to public information security.
no code implementations • 25 Jul 2024 • Tian Hao, Changxin Shi, Qingqing Wu, Bin Xia, Yinghong Guo, Lianghui Ding, Feng Yang
Integrated sensing and communication (ISAC) emerges as an essential technique for overcoming spectrum congestion.
no code implementations • 25 Jul 2024 • Jintong Hu, Bin Xia, Bin Chen, Wenming Yang, Lei Zhang
Although these approaches have shown promising results, their performance is constrained by the limited representation ability of discrete latent codes in the encoded features.
no code implementations • 7 Jul 2024 • Tian Hao, Changxin Shi, Yinghong Guo, Bin Xia, Feng Yang
This paper investigates a fluid antenna (FA) enhanced integrated sensing and communication (ISAC) system consisting of a base station (BS), multiple single-antenna communication users, and one point target, where the BS is equipped with FAs to enhance both the communication and sensing performance.
no code implementations • 15 Jun 2024 • Tong Wu, Zhiyong Chen, Meixia Tao, Bin Xia, Wenjun Zhang
In the proposed method, the transmitter extracts semantic features for two users separately.
no code implementations • 19 Mar 2024 • Jintong Hu, Bin Xia, Bingchen Li, Wenming Yang
Deep learning-based denoisers have been the focus of recent development in image denoising.
1 code implementation • 18 Mar 2024 • Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, Wenming Yang
To address these challenges, we propose VmambaIR, which introduces State Space Models (SSMs) with linear complexity into comprehensive image restoration tasks.
no code implementations • 21 Jan 2024 • Xiaoyu Jin, Yuan Shi, Bin Xia, Wenming Yang
By employing a pretrained multi-modal large language model and a vision language model, we generate text descriptions and encode them as context embedding with degradation information for the degraded image.
1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Bin Xia, Hong Xu, Jiaya Jia
Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks.
Ranked #3 on Code Generation on APPS
1 code implementation • 27 Nov 2023 • Bin Xia, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia
In the first stage, we train the MLLM to grasp the properties of image generation and editing, enabling it to generate detailed prompts.
no code implementations • 16 Nov 2023 • Yuan Shi, Bin Xia, Rui Zhu, Qingmin Liao, Wenming Yang
Color-guided depth map super-resolution (CDSR) improves the spatial resolution of a low-quality depth map with the corresponding high-quality color map, benefiting various applications such as 3D reconstruction, virtual reality, and augmented reality.
no code implementations • 26 Aug 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations.
1 code implementation • NeurIPS 2023 • Zheng Chen, Yulun Zhang, Ding Liu, Bin Xia, Jinjin Gu, Linghe Kong, Xin Yuan
Specifically, we perform the DM in a highly compacted latent space to generate the prior feature for the deblurring process.
1 code implementation • ICCV 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool
Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network.
1 code implementation • 30 Jan 2023 • Chenhan Cao, Xiaoyu Fang, Bingqing Luo, Bin Xia
The recurrent recommendation network recommends the appropriate expert based on the assumption that the previous expert in the recommendation sequence cannot solve the problem.
1 code implementation • 28 Jan 2023 • Juan Wang, Bin Xia
The proposed polar transformation based MIL formulation works for both tight and loose bounding boxes, in which a positive bag is defined as pixels in a polar line of a bounding box with one endpoint located inside the object enclosed by the box and the other endpoint located at one of the four sides of the box.
1 code implementation • 30 Nov 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
It consists of a knowledge distillation based implicit degradation estimator network (KD-IDE) and an efficient SR network.
2 code implementations • 2 Oct 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.
1 code implementation • 28 Jul 2022 • Bin Xia, Yapeng Tian, Yulun Zhang, Yucheng Hang, Wenming Yang, Qingmin Liao
Most CNN-based super-resolution (SR) methods assume that the degradation is known (e.g., bicubic).
1 code implementation • CVPR 2023 • Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool
In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.
1 code implementation • CVPR 2022 • Yucheng Hang, Bin Xia, Wenming Yang, Qingmin Liao
In addition, we propose a background-attentional adaptive instance normalization (BAIN) to achieve an attention-weighted background feature distribution according to the foreground-background feature similarity.
1 code implementation • 3 Mar 2022 • Juan Wang, Bin Xia
It presents a multiple instance learning strategy based on polar transformation to assist image segmentation when loose bounding boxes are employed as supervision.
1 code implementation • 12 Jan 2022 • Bin Xia, Yapeng Tian, Yucheng Hang, Wenming Yang, Qingmin Liao, Jie Zhou
To improve matching efficiency, we design a novel Embedded PatchMatch scheme with random sample propagation, which involves end-to-end training with asymptotically linear computational cost with respect to the input size.
1 code implementation • 11 Jan 2022 • Bin Xia, Yucheng Hang, Yapeng Tian, Wenming Yang, Qingmin Liao, Jie Zhou
To demonstrate the effectiveness of ENLCA, we build an architecture called Efficient Non-Local Contrastive Network (ENLCN) by adding a few of our modules to a simple backbone.
1 code implementation • 3 Oct 2021 • Juan Wang, Bin Xia
For this purpose, we develop a two-task network named CDRNet for accurate CDR measurement, with one task for weakly supervised image segmentation and the other for bounding-box regression.
1 code implementation • 3 Oct 2021 • Juan Wang, Bin Xia
Two variants of smooth maximum approximation, i.e., the $\alpha$-softmax function and the $\alpha$-quasimax function, are exploited to overcome the numerical instability introduced by the maximum function in bag prediction.
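As a rough illustration of the two smooth maximum approximations named in this entry, the sketch below uses the definitions that are commonly given for them: the $\alpha$-softmax as a softmax-weighted mean, $\sum_i x_i e^{\alpha x_i} / \sum_i e^{\alpha x_i}$, and the $\alpha$-quasimax as a shifted log-sum-exp, $\frac{1}{\alpha}\log\sum_i e^{\alpha x_i} - \frac{\log n}{\alpha}$. These formulas, the function names, and the sample values are assumptions for illustration and are not taken from the paper's code; $\alpha > 0$ is assumed.

```python
import math


def alpha_softmax(xs, alpha):
    """Softmax-weighted mean of xs (assumed definition of alpha-softmax)."""
    # Subtract the largest exponent before exponentiating to avoid overflow.
    m = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - m) for x in xs]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


def alpha_quasimax(xs, alpha):
    """Log-sum-exp shifted by log(n)/alpha (assumed definition of alpha-quasimax)."""
    m = max(alpha * x for x in xs)
    log_sum_exp = m + math.log(sum(math.exp(alpha * x - m) for x in xs))
    return (log_sum_exp - math.log(len(xs))) / alpha


# Hypothetical pixel scores in one bag; a larger alpha tracks max(scores) more closely.
scores = [0.1, 0.9, 0.5]
for alpha in (1.0, 10.0, 50.0):
    print(alpha, alpha_softmax(scores, alpha), alpha_quasimax(scores, alpha))
```

Under these definitions both functions never exceed the true maximum, and the quasimax stays within $\log(n)/\alpha$ of it, so increasing $\alpha$ tightens the approximation while keeping the bag prediction differentiable.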