Search Results for author: XueWei Li

Found 14 papers, 5 papers with code

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

2 code implementations • CVPR 2023 • Guangcong Zheng, Xianpan Zhou, XueWei Li, Zhongang Qi, Ying Shan, Xi Li

To overcome the difficult multimodal fusion of image and layout, we propose to construct a structural image patch with region information and transform the patched image into a special layout to fuse with the normal layout in a unified form.

Ranked #1 on Layout-to-Image Generation on Visual Genome 128x128

Layout-to-Image Generation Object

Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection

1 code implementation • ICCV 2023 • Longrong Yang, Xianpan Zhou, XueWei Li, Liang Qiao, Zheyang Li, Ziwei Yang, Gaoang Wang, Xi Li

Thus, the optimum of the distillation loss does not necessarily lead to the optimal student classification scores for dense object detectors.

Binary Classification Classification +4

SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

1 code implementation • 6 Jun 2023 • XueWei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li

Experimental results on Stanford2D3D Panoramic datasets show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance is improved by an order of magnitude.

Ranked #4 on Semantic Segmentation on Stanford2D3D Panoramic

Semantic Segmentation

Epoch-evolving Gaussian Process Guided Learning

1 code implementation • 25 Jun 2020 • Jiabao Cui, XueWei Li, Bin Li, Hanbin Zhao, Bourahla Omar, Xi Li

In this paper, we propose a novel learning scheme called epoch-evolving Gaussian Process Guided Learning (GPGL), which aims at characterizing the correlation information between the batch-level distribution and the global data distribution.

Windformer: Bi-Directional Long-Distance Spatio-Temporal Network For Wind Speed Prediction

1 code implementation • 24 Nov 2023 • XueWei Li, Zewen Shang, Zhiqiang Liu, Jian Yu, Wei Xiong, Mei Yu

Historical and future time information includes the trend of airflow changes; whether this dynamic information can be utilized also affects prediction performance.

Management Time Series

Scene Learning: Deep Convolutional Networks For Wind Power Prediction by Embedding Turbines into Grid Space

no code implementations • 16 Jul 2018 • Ruiguo Yu, Zhi-Qiang Liu, Xuewei Li, Wenhuan Lu, Mei Yu, Jianrong Wang, Bin Li

There has been a lot of research based on the time series of wind power or speed, but in fact these time series cannot express the temporal and spatial changes of wind, which fundamentally hinders the advance of wind power prediction.

Time Series Time Series Analysis

ResKD: Residual-Guided Knowledge Distillation

no code implementations • 8 Jun 2020 • Xuewei Li, Songyuan Li, Bourahla Omar, Fei Wu, Xi Li

In this paper, we see knowledge distillation in a fresh light, using the knowledge gap, or the residual, between a teacher and a student as guidance to train a much more lightweight student, called a res-student.

Knowledge Distillation

Self-Supervised Joint Learning Framework of Depth Estimation via Implicit Cues

no code implementations • 17 Jun 2020 • Jianrong Wang, Ge Zhang, Zhen-Yu Wu, XueWei Li, Li Liu

Compared with static views, abundant dynamic properties between video frames are beneficial to refined depth estimation, especially for dynamic objects.

Monocular Depth Estimation

Residual-guided Personalized Speech Synthesis based on Face Image

no code implementations • 1 Apr 2022 • Jianrong Wang, Zixuan Wang, Xiaosheng Hu, XueWei Li, Qiang Fang, Li Liu

Experimental results show that the speech synthesized by our model is comparable to the personalized speech synthesized by training a large amount of audio data in previous works.

Speech Synthesis

Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension

no code implementations • 21 Apr 2022 • Peihan Miao, Wei Su, Gaoang Wang, XueWei Li, Xi Li

As an important and challenging problem in vision-language tasks, referring expression comprehension (REC) generally requires a large amount of multi-grained information of visual and linguistic modalities to realize accurate reasoning.

Informativeness Referring Expression +1

IDPL: Intra-subdomain adaptation adversarial learning segmentation method based on Dynamic Pseudo Labels

no code implementations • 7 Oct 2022 • XueWei Li, Weilun Zhang, Jie Gao, Xuzhou Fu, Jian Yu

Secondly, the subdomain classifier module based on instance confidence is constructed, which can dynamically divide the target domain into easy and difficult subdomains according to the relative proportion of easy and difficult instances.

Pseudo Label Segmentation +3

Emotional Reaction Intensity Estimation Based on Multimodal Data

no code implementations • 16 Mar 2023 • Shangfei Wang, Jiaqiang Wu, Feiyi Zheng, Xin Li, XueWei Li, Suwen Wang, Yi Wu, Yanan Chang, Xiangyu Miao

In this paper, better features are extracted with SOTA pretrained models.

Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration

no code implementations • 13 Oct 2023 • XueWei Li, Yaqiao Zhu, Jie Gao, Xi Wei, Ruixuan Zhang, Yuan Tian, Zhiqiang Liu

Segmentation of nodules in thyroid ultrasound imaging plays a crucial role in the detection and treatment of thyroid cancer.

Image Segmentation Medical Image Segmentation +2

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

no code implementations • 15 Mar 2024 • Tao Wu, XueWei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li

Controllable spherical panoramic image generation holds substantial potential for application across a variety of domains. However, it remains a challenging task due to the inherent spherical distortion and geometry characteristics, which result in low-quality content generation. In this paper, we introduce a novel framework, SphereDiffusion, to address these unique challenges and to better generate high-quality and precisely controllable spherical panoramic images. For the spherical distortion characteristic, we embed the semantics of the distorted object with text encoding, then explicitly construct the relationship with text-object correspondence to better use the pre-trained knowledge of the planar images. Meanwhile, we employ a deformable technique to mitigate the semantic deviation in latent space caused by spherical distortion. For the spherical geometry characteristic, by virtue of spherical rotation invariance, we improve the data diversity and optimization objectives in the training process, enabling the model to better learn the spherical geometry characteristic. Furthermore, we enhance the denoising process of the diffusion model, enabling it to effectively use the learned geometric characteristic to ensure the boundary continuity of the generated images. With these specific techniques, experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation and achieves a relative FID reduction of around 35% on average.

Denoising Image Generation
