Search Results for author: Zhibin Wang

Found 34 papers, 11 papers with code

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

1 code implementation • 28 Jan 2024 • Shaofeng Zhang, Jinfa Huang, Qiang Zhou, Zhibin Wang, Fan Wang, Jiebo Luo, Junchi Yan

At inference, we generate images with arbitrary expansion multiples by inputting an anchor image and its corresponding positional embeddings.

Image Outpainting
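The snippet above describes conditioning generation on an anchor image plus positional embeddings for the expanded canvas. A minimal sketch of that positional-query idea (the grid layout and function name are our assumptions, not the paper's released code):

```python
# Illustrative only: place an anchor image at the centre of a canvas enlarged
# by `multiple`, and enumerate every grid position with a flag saying whether
# it is covered by the anchor (known pixels) or must be outpainted.

def outpaint_positions(anchor_size, multiple):
    """Return (row, col, inside_anchor) for each cell of the expanded grid."""
    size = anchor_size * multiple
    offset = (size - anchor_size) // 2  # centre the anchor image
    grid = []
    for r in range(size):
        for c in range(size):
            inside = (offset <= r < offset + anchor_size
                      and offset <= c < offset + anchor_size)
            grid.append((r, c, inside))
    return grid
```

A generator conditioned on such per-cell positions can, in principle, be queried at any expansion multiple without retraining, which matches the "arbitrary expansion multiples" claim in the snippet.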

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

1 code implementation • 21 Dec 2023 • Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, Gang Yu

This paper presents Paint3D, a novel coarse-to-fine generative framework that is capable of producing high-resolution, lighting-less, and diverse 2K UV texture maps for untextured 3D meshes conditioned on text or image inputs.

2k

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

no code implementations • 27 Nov 2023 • Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

Next, we introduce ChartLlama, a multi-modal large language model that we trained using our newly created dataset.

Language Modelling · Large Language Model

InfMLLM: A Unified Framework for Visual-Language Tasks

2 code implementations • 12 Nov 2023 • Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.

Image Captioning · Instruction Following +3

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

no code implementations • ICCV 2023 • Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen

In this paper, we propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.

Monocular Depth Estimation

ES-MVSNet: Efficient Framework for End-to-end Self-supervised Multi-View Stereo

no code implementations • 4 Aug 2023 • Qiang Zhou, Chaohui Yu, Jingliang Li, Yuang Liu, Jing Wang, Zhibin Wang

… to provide additional consistency constraints, which increases GPU memory consumption and complicates the model's structure and training pipeline.

Optical Flow Estimation · Semantic Segmentation

Improved Neural Radiance Fields Using Pseudo-depth and Fusion

no code implementations • 27 Jul 2023 • Jingliang Li, Qiang Zhou, Chaohui Yu, Zhengda Lu, Jun Xiao, Zhibin Wang, Fan Wang

To make the constructed volumes as close as possible to the surfaces of objects in the scene and the rendered depth more accurate, we propose to perform depth prediction and radiance field reconstruction simultaneously.

Depth Estimation · Depth Prediction +1

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation

no code implementations • 26 Jul 2023 • Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang

To better utilize the sparse 3D points, we propose an efficient point cloud guidance loss to adaptively drive the NeRF's geometry to align with the shape of the sparse 3D points.

3D Generation · Text to 3D
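The Points-to-3D snippet describes a point-cloud guidance loss that pulls the NeRF geometry toward sparse 3D points. A toy sketch of one plausible form of such a loss (a nearest-point squared distance; this is our illustration, not the authors' implementation):

```python
# Illustrative only: penalise each predicted surface point by its squared
# distance to the nearest sparse 3D point, so minimising the loss drags the
# predicted geometry toward the point cloud.

def point_guidance_loss(pred_points, sparse_points):
    """Mean nearest-neighbour squared distance from predictions to the cloud."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    total = 0.0
    for p in pred_points:
        total += min(sq_dist(p, s) for s in sparse_points)
    return total / len(pred_points)
```

The loss is zero exactly when every predicted point coincides with some sparse point, which is the alignment behaviour the snippet describes.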

SwinRDM: Integrate SwinRNN with Diffusion Model towards High-Resolution and High-Quality Weather Forecasting

no code implementations • 5 Jun 2023 • Lei Chen, Fei Du, Yuan Hu, Fan Wang, Zhibin Wang

Recurrent predictions for future atmospheric fields are first performed at 1.40625-degree resolution, and then a diffusion-based super-resolution model is leveraged to recover the high spatial resolution and finer-scale atmospheric details.

Super-Resolution · Weather Forecasting

UniNeXt: Exploring A Unified Architecture for Vision Recognition

1 code implementation • 26 Apr 2023 • Fangjian Lin, Jianlong Yuan, Sitong Wu, Fan Wang, Zhibin Wang

Interestingly, the ranking of these spatial token mixers also changes under our UniNeXt, suggesting that an excellent spatial token mixer may be stifled by a suboptimal general architecture, which further underscores the importance of studying the general architecture of the vision backbone.

Spatial Token Mixer

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers

no code implementations • 1 Mar 2023 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Fan Wang

In this paper, we propose an end-to-end framework for oriented object detection, which simplifies the model pipeline and obtains superior performance.

Object · object-detection +3

LMSeg: Language-guided Multi-dataset Segmentation

no code implementations • 27 Feb 2023 • Qiang Zhou, Yuang Liu, Chaohui Yu, Jingliang Li, Zhibin Wang, Fan Wang

Instead of relabeling each dataset with the unified taxonomy, a category-guided decoding module is designed to dynamically guide predictions to each dataset's taxonomy.

Image Augmentation · Panoptic Segmentation +1

PolyBuilding: Polygon Transformer for End-to-End Building Extraction

no code implementations • 3 Nov 2022 • Yuan Hu, Zhibin Wang, Zhou Huang, Yu Liu

Given a set of polygon queries, the model learns the relations among them and encodes context information from the image to predict the final set of building polygons with fixed vertex numbers.

Over-the-Air Computation: Foundations, Technologies, and Applications

no code implementations • 19 Oct 2022 • Zhibin Wang, Yapeng Zhao, Yong Zhou, Yuanming Shi, Chunxiao Jiang, Khaled B. Letaief

The rapid advancement of artificial intelligence technologies has given rise to diversified intelligent services, which place unprecedented demands on massive connectivity and gigantic data aggregation.

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher

no code implementations • 7 Sep 2022 • Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li

Specifically, MimCo takes a pre-trained contrastive learning model as the teacher model and is pre-trained with two types of learning targets: patch-level and image-level reconstruction losses.

Contrastive Learning · Self-Supervised Learning
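The MimCo snippet mentions a patch-level reconstruction loss computed under masked image modeling. A minimal sketch of how such a loss is typically restricted to the masked patches (this is a generic illustration under our own naming, not MimCo's code):

```python
# Illustrative only: mean squared error over masked patches. Each entry is one
# patch's (scalar) feature; mask[i] == 1 marks a patch that was hidden from
# the encoder and therefore contributes to the reconstruction loss.

def masked_patch_loss(pred, target, mask):
    """MSE averaged over masked patches only; visible patches are ignored."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)
```

MimCo, per the snippet, pairs a loss of this patch-level kind with an image-level reconstruction target supplied by the frozen contrastive teacher.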

FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation

1 code implementation • 30 Aug 2022 • Jianlong Yuan, Qian Qi, Fei Du, Zhibin Wang, Fan Wang, Yifan Liu

Inspired by the recent progress on semantic directions on feature-space, we propose to include augmentations in feature space for efficient distillation.

Knowledge Distillation · Segmentation +1

Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation

1 code implementation • 24 Aug 2022 • Jianlong Yuan, Jinchao Ge, Zhibin Wang, Yifan Liu

More specifically, we use the pseudo-labels generated by a mean teacher to supervise the student network to achieve a mutual knowledge distillation between the two branches.

Knowledge Distillation · Pseudo Label +1
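The snippet above relies on a mean teacher producing pseudo-labels for the student. A toy sketch of the two standard ingredients, the EMA weight update and confidence-thresholded pseudo-labelling (function names and the threshold are our assumptions, not the paper's code):

```python
# Illustrative only: mean-teacher mechanics on plain Python lists.

def ema_update(teacher_weights, student_weights, momentum=0.99):
    """Blend student weights into the teacher: t <- m*t + (1-m)*s."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_weights, student_weights)]

def pseudo_label(teacher_probs, threshold=0.9):
    """Keep a hard label only where the teacher is confident; else mark -1."""
    labels = []
    for probs in teacher_probs:
        best = max(range(len(probs)), key=lambda i: probs[i])
        labels.append(best if probs[best] >= threshold else -1)
    return labels
```

Because the teacher is an average of past students, its pseudo-labels are smoother than the student's own predictions, which is what makes them usable as supervision for the other branch.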

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

no code implementations • 2 Aug 2022 • Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li

Further, we provide an in-depth analysis of the mechanism and rationale behind our approach, which gives us a better understanding of why leveraging logits instead of features helps domain generalization.

Data Augmentation · Domain Generalization +1

Interference Management for Over-the-Air Federated Learning in Multi-Cell Wireless Networks

no code implementations • 6 Jun 2022 • Zhibin Wang, Yong Zhou, Yuanming Shi, Weihua Zhuang

We characterize the Pareto boundary of the error-induced gap region to quantify the learning performance trade-off among different FL tasks, based on which we formulate an optimization problem to minimize the sum of error-induced gaps in all cells.

Federated Learning · Management

Point RCNN: An Angle-Free Framework for Rotated Object Detection

no code implementations • 28 May 2022 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Hao Li

To tackle this problem, we propose a purely angle-free framework for rotated object detection, called Point RCNN, which mainly consists of PointRPN and PointReg.

Object · object-detection +1

SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation

no code implementations • 26 May 2022 • Yuan Hu, Lei Chen, Zhibin Wang, Hao Li

We also compare four categories of perturbation methods for ensemble forecasting, i.e., fixed distribution perturbation, learned distribution perturbation, MC dropout, and multi-model ensemble.

Weather Forecasting

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

1 code implementation • ICCV 2021 • Jianlong Yuan, Yifan Liu, Chunhua Shen, Zhibin Wang, Hao Li

Previous works [3, 27] fail to employ strong augmentation in pseudo-label learning efficiently, as the large distribution change caused by strong augmentation harms the batch normalisation statistics.

Data Augmentation · Image Classification +3
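The snippet above points at strong augmentation corrupting batch-norm statistics. A toy numeric sketch of the effect (all values are illustrative; the update rule is the standard BN running-mean formula):

```python
# Illustrative only: running statistics accumulated on weakly-augmented data
# drift when strongly-augmented batches, whose distribution is shifted, are
# mixed into the same batch-norm layer.

def update_running_mean(running, batch, momentum=0.9):
    """Standard BN-style running-mean update."""
    batch_mean = sum(batch) / len(batch)
    return momentum * running + (1.0 - momentum) * batch_mean

weak = [0.9, 1.0, 1.1]    # roughly matches the running mean of 1.0
strong = [3.0, 3.2, 2.8]  # strong augmentation shifts the distribution

mean_weak = update_running_mean(1.0, weak)     # stays near 1.0
mean_mixed = update_running_mean(1.0, strong)  # pulled away toward 3.0
```

A single shifted batch already moves the running mean noticeably, which is why mixing strongly-augmented batches into the normal BN stream degrades the statistics used at test time.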

TFPose: Direct Human Pose Estimation with Transformers

no code implementations • 29 Mar 2021 • Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang

We propose a human pose estimation framework that solves the task in the regression-based fashion.

Ranked #26 on Pose Estimation on MPII Human Pose (using extra training data)

Pose Estimation · regression

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

1 code implementation • CVPR 2021 • Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

To alleviate the confirmation bias problem and improve the quality of pseudo annotations, we further propose a co-rectify scheme based on Instant-Teaching, denoted as Instant-Teaching$^*$.

Ranked #12 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Object · object-detection +2

Object Detection Made Simpler by Eliminating Heuristic NMS

no code implementations • 28 Jan 2021 • Qiang Zhou, Chaohui Yu, Chunhua Shen, Zhibin Wang, Hao Li

On the COCO dataset, our simple design achieves superior performance compared to both the FCOS baseline detector with NMS post-processing and the recent end-to-end NMS-free detectors.

Object · object-detection +1

Federated Learning via Intelligent Reflecting Surface

no code implementations • 10 Nov 2020 • Zhibin Wang, Jiahang Qiu, Yong Zhou, Yuanming Shi, Liqun Fu, Wei Chen, Khaled B. Letaief

To optimize the learning performance, we formulate an optimization problem that jointly optimizes the device selection, the aggregation beamformer at the base station (BS), and the phase shifts at the IRS to maximize the number of devices participating in the model aggregation of each communication round under certain mean-squared-error (MSE) requirements.

Federated Learning
