Search Results for author: Zhibin Wang

Found 34 papers, 11 papers with code

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

1 code implementation • 28 Jan 2024 • Shaofeng Zhang, Jinfa Huang, Qiang Zhou, Zhibin Wang, Fan Wang, Jiebo Luo, Junchi Yan

At inference, we generate images with arbitrary expansion multiples by inputting an anchor image and its corresponding positional embeddings.

Image Outpainting
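The snippet above describes conditioning generation on an anchor image plus positional embeddings for the expanded canvas. A minimal sketch of that positional-query idea (the grid layout and function name are our assumptions, not the paper's released code):

```python
# Illustrative only: place an anchor image at the centre of a canvas enlarged
# by `multiple`, and enumerate every grid position with a flag saying whether
# it is covered by the anchor (known pixels) or must be outpainted.

def outpaint_positions(anchor_size, multiple):
    """Return (row, col, inside_anchor) for each cell of the expanded grid."""
    size = anchor_size * multiple
    offset = (size - anchor_size) // 2  # centre the anchor image
    grid = []
    for r in range(size):
        for c in range(size):
            inside = (offset <= r < offset + anchor_size
                      and offset <= c < offset + anchor_size)
            grid.append((r, c, inside))
    return grid
```

A generator conditioned on such per-cell positions can, in principle, be queried at any expansion multiple without retraining, which matches the "arbitrary expansion multiples" claim in the snippet.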

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

1 code implementation • 21 Dec 2023 • Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, Gang Yu

This paper presents Paint3D, a novel coarse-to-fine generative framework that is capable of producing high-resolution, lighting-less, and diverse 2K UV texture maps for untextured 3D meshes conditioned on text or image inputs.

2k

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

no code implementations • 27 Nov 2023 • Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

Next, we introduce ChartLlama, a multi-modal large language model that we trained using our newly created dataset.

Language Modelling · Large Language Model

InfMLLM: A Unified Framework for Visual-Language Tasks

2 code implementations • 12 Nov 2023 • Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.

Image Captioning · Instruction Following +3

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

no code implementations • ICCV 2023 • Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen

In this paper, we propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.

Monocular Depth Estimation

ES-MVSNet: Efficient Framework for End-to-end Self-supervised Multi-View Stereo

no code implementations • 4 Aug 2023 • Qiang Zhou, Chaohui Yu, Jingliang Li, Yuang Liu, Jing Wang, Zhibin Wang

… to provide additional consistency constraints, which increases GPU memory consumption and complicates the model's structure and training pipeline.

Optical Flow Estimation · Semantic Segmentation

Improved Neural Radiance Fields Using Pseudo-depth and Fusion

no code implementations • 27 Jul 2023 • Jingliang Li, Qiang Zhou, Chaohui Yu, Zhengda Lu, Jun Xiao, Zhibin Wang, Fan Wang

To make the constructed volumes as close as possible to the surfaces of objects in the scene and the rendered depth more accurate, we propose to perform depth prediction and radiance field reconstruction simultaneously.

Depth Estimation · Depth Prediction +1

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation

no code implementations • 26 Jul 2023 • Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang

To better utilize the sparse 3D points, we propose an efficient point cloud guidance loss to adaptively drive the NeRF's geometry to align with the shape of the sparse 3D points.

3D Generation · Text to 3D
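The Points-to-3D snippet describes a point-cloud guidance loss that pulls the NeRF geometry toward sparse 3D points. A toy sketch of one plausible form of such a loss (a nearest-point squared distance; this is our illustration, not the authors' implementation):

```python
# Illustrative only: penalise each predicted surface point by its squared
# distance to the nearest sparse 3D point, so minimising the loss drags the
# predicted geometry toward the point cloud.

def point_guidance_loss(pred_points, sparse_points):
    """Mean nearest-neighbour squared distance from predictions to the cloud."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    total = 0.0
    for p in pred_points:
        total += min(sq_dist(p, s) for s in sparse_points)
    return total / len(pred_points)
```

The loss is zero exactly when every predicted point coincides with some sparse point, which is the alignment behaviour the snippet describes.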

SwinRDM: Integrate SwinRNN with Diffusion Model towards High-Resolution and High-Quality Weather Forecasting

no code implementations • 5 Jun 2023 • Lei Chen, Fei Du, Yuan Hu, Fan Wang, Zhibin Wang

Recurrent predictions for future atmospheric fields are first performed at 1.40625-degree resolution, and then a diffusion-based super-resolution model is leveraged to recover the high spatial resolution and finer-scale atmospheric details.

Super-Resolution · Weather Forecasting

UniNeXt: Exploring A Unified Architecture for Vision Recognition

1 code implementation • 26 Apr 2023 • Fangjian Lin, Jianlong Yuan, Sitong Wu, Fan Wang, Zhibin Wang

Interestingly, the ranking of these spatial token mixers also changes under our UniNeXt, suggesting that an excellent spatial token mixer may be stifled by a suboptimal general architecture, which further underscores the importance of studying the general architecture of the vision backbone.

Spatial Token Mixer

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers

no code implementations • 1 Mar 2023 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Fan Wang

In this paper, we propose an end-to-end framework for oriented object detection, which simplifies the model pipeline and obtains superior performance.

Object · object-detection +3

LMSeg: Language-guided Multi-dataset Segmentation

no code implementations • 27 Feb 2023 • Qiang Zhou, Yuang Liu, Chaohui Yu, Jingliang Li, Zhibin Wang, Fan Wang

Instead of relabeling each dataset with the unified taxonomy, a category-guided decoding module is designed to dynamically guide predictions to each dataset's taxonomy.

Image Augmentation · Panoptic Segmentation +1

PolyBuilding: Polygon Transformer for End-to-End Building Extraction

no code implementations • 3 Nov 2022 • Yuan Hu, Zhibin Wang, Zhou Huang, Yu Liu

Given a set of polygon queries, the model learns the relations among them and encodes context information from the image to predict the final set of building polygons with fixed vertex numbers.

Over-the-Air Computation: Foundations, Technologies, and Applications

no code implementations • 19 Oct 2022 • Zhibin Wang, Yapeng Zhao, Yong Zhou, Yuanming Shi, Chunxiao Jiang, Khaled B. Letaief

The rapid advancement of artificial intelligence technologies has given rise to diversified intelligent services, which place unprecedented demands on massive connectivity and gigantic data aggregation.

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher

no code implementations • 7 Sep 2022 • Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li

Specifically, MimCo takes a pre-trained contrastive learning model as the teacher model and is pre-trained with two types of learning targets: patch-level and image-level reconstruction losses.

Contrastive Learning · Self-Supervised Learning
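The MimCo snippet mentions a patch-level reconstruction loss computed under masked image modeling. A minimal sketch of how such a loss is typically restricted to the masked patches (this is a generic illustration under our own naming, not MimCo's code):

```python
# Illustrative only: mean squared error over masked patches. Each entry is one
# patch's (scalar) feature; mask[i] == 1 marks a patch that was hidden from
# the encoder and therefore contributes to the reconstruction loss.

def masked_patch_loss(pred, target, mask):
    """MSE averaged over masked patches only; visible patches are ignored."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)
```

MimCo, per the snippet, pairs a loss of this patch-level kind with an image-level reconstruction target supplied by the frozen contrastive teacher.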

FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation

1 code implementation • 30 Aug 2022 • Jianlong Yuan, Qian Qi, Fei Du, Zhibin Wang, Fan Wang, Yifan Liu

Inspired by the recent progress on semantic directions on feature-space, we propose to include augmentations in feature space for efficient distillation.

Knowledge Distillation · Segmentation +1

Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation

1 code implementation • 24 Aug 2022 • Jianlong Yuan, Jinchao Ge, Zhibin Wang, Yifan Liu

More specifically, we use the pseudo-labels generated by a mean teacher to supervise the student network to achieve a mutual knowledge distillation between the two branches.

Knowledge Distillation · Pseudo Label +1
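The snippet above relies on a mean teacher producing pseudo-labels for the student. A toy sketch of the two standard ingredients, the EMA weight update and confidence-thresholded pseudo-labelling (function names and the threshold are our assumptions, not the paper's code):

```python
# Illustrative only: mean-teacher mechanics on plain Python lists.

def ema_update(teacher_weights, student_weights, momentum=0.99):
    """Blend student weights into the teacher: t <- m*t + (1-m)*s."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_weights, student_weights)]

def pseudo_label(teacher_probs, threshold=0.9):
    """Keep a hard label only where the teacher is confident; else mark -1."""
    labels = []
    for probs in teacher_probs:
        best = max(range(len(probs)), key=lambda i: probs[i])
        labels.append(best if probs[best] >= threshold else -1)
    return labels
```

Because the teacher is an average of past students, its pseudo-labels are smoother than the student's own predictions, which is what makes them usable as supervision for the other branch.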

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

no code implementations • 2 Aug 2022 • Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li

Further, we provide an in-depth analysis of the mechanism and rationale behind our approach, which gives us a better understanding of why leveraging logits instead of features helps domain generalization.

Data Augmentation · Domain Generalization +1

Interference Management for Over-the-Air Federated Learning in Multi-Cell Wireless Networks

no code implementations • 6 Jun 2022 • Zhibin Wang, Yong Zhou, Yuanming Shi, Weihua Zhuang

We characterize the Pareto boundary of the error-induced gap region to quantify the learning performance trade-off among different FL tasks, based on which we formulate an optimization problem to minimize the sum of error-induced gaps in all cells.

Federated Learning · Management

Point RCNN: An Angle-Free Framework for Rotated Object Detection

no code implementations • 28 May 2022 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Hao Li

To tackle this problem, we propose a purely angle-free framework for rotated object detection, called Point RCNN, which mainly consists of PointRPN and PointReg.

Object · object-detection +1

SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation

no code implementations • 26 May 2022 • Yuan Hu, Lei Chen, Zhibin Wang, Hao Li

We also compare four categories of perturbation methods for ensemble forecasting, i.e., fixed distribution perturbation, learned distribution perturbation, MC dropout, and multi-model ensemble.

Weather Forecasting

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

1 code implementation • ICCV 2021 • Jianlong Yuan, Yifan Liu, Chunhua Shen, Zhibin Wang, Hao Li

Previous works [3, 27] fail to employ strong augmentation in pseudo-label learning efficiently, as the large distribution change caused by strong augmentation harms the batch normalisation statistics.

Data Augmentation · Image Classification +3
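The snippet above points at strong augmentation corrupting batch-norm statistics. A toy numeric sketch of the effect (all values are illustrative; the update rule is the standard BN running-mean formula):

```python
# Illustrative only: running statistics accumulated on weakly-augmented data
# drift when strongly-augmented batches, whose distribution is shifted, are
# mixed into the same batch-norm layer.

def update_running_mean(running, batch, momentum=0.9):
    """Standard BN-style running-mean update."""
    batch_mean = sum(batch) / len(batch)
    return momentum * running + (1.0 - momentum) * batch_mean

weak = [0.9, 1.0, 1.1]    # roughly matches the running mean of 1.0
strong = [3.0, 3.2, 2.8]  # strong augmentation shifts the distribution

mean_weak = update_running_mean(1.0, weak)     # stays near 1.0
mean_mixed = update_running_mean(1.0, strong)  # pulled away toward 3.0
```

A single shifted batch already moves the running mean noticeably, which is why mixing strongly-augmented batches into the normal BN stream degrades the statistics used at test time.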

TFPose: Direct Human Pose Estimation with Transformers

no code implementations • 29 Mar 2021 • Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang

We propose a human pose estimation framework that solves the task in the regression-based fashion.

Ranked #26 on Pose Estimation on MPII Human Pose (using extra training data)

Pose Estimation · regression

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

1 code implementation • CVPR 2021 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

To alleviate the confirmation bias problem and improve the quality of pseudo annotations, we further propose a co-rectify scheme based on Instant-Teaching, denoted as Instant-Teaching$^*$.

Ranked #12 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Object · object-detection +2

Object Detection Made Simpler by Eliminating Heuristic NMS

no code implementations • 28 Jan 2021 • Qiang Zhou, Chaohui Yu, Chunhua Shen, Zhibin Wang, Hao Li

On the COCO dataset, our simple design achieves superior performance compared to both the FCOS baseline detector with NMS post-processing and the recent end-to-end NMS-free detectors.

Object · object-detection +1

Federated Learning via Intelligent Reflecting Surface

no code implementations • 10 Nov 2020 • Zhibin Wang, Jiahang Qiu, Yong Zhou, Yuanming Shi, Liqun Fu, Wei Chen, Khaled B. Letaief

To optimize the learning performance, we formulate an optimization problem that jointly optimizes the device selection, the aggregation beamformer at the base station (BS), and the phase shifts at the IRS to maximize the number of devices participating in the model aggregation of each communication round under certain mean-squared-error (MSE) requirements.

Federated Learning
