Search Results for author: Zhitong Xiong

Found 24 papers, 12 papers with code

ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models

no code implementations • 17 Feb 2024 • Zhenghang Yuan, Zhitong Xiong, Lichao Mou, Xiao Xiang Zhu

In this context, we introduce a global-scale, high-quality image-text dataset for remote sensing, providing natural language descriptions for Sentinel-2 data to facilitate the understanding of satellite imagery for common users.

Semantic Segmentation

Efficient Subseasonal Weather Forecast using Teleconnection-informed Transformers

no code implementations • 31 Jan 2024 • Shan Zhao, Zhitong Xiong, Xiao Xiang Zhu

Subseasonal forecasting, which is pivotal for agriculture, water resource management, and early warning of disasters, faces challenges due to the chaotic nature of the atmosphere.

Weather Forecasting

SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model

1 code implementation • 18 Jan 2024 • Yang Zhan, Zhitong Xiong, Yuan Yuan

Specifically, after projecting RS visual features to the language domain via an alignment layer, they are fed jointly with task-specific instructions into an LLM-based RS decoder to predict answers for RS open-ended tasks.

Instruction Following, Language Modelling, +2

One for All: Toward Unified Foundation Models for Earth Vision

no code implementations • 15 Jan 2024 • Zhitong Xiong, Yi Wang, Fahong Zhang, Xiao Xiang Zhu

Current remote sensing foundation models typically specialize in a single modality or a specific spatial resolution range, limiting their versatility for downstream datasets.

Mono3DVG: 3D Visual Grounding in Monocular Images

1 code implementation • 13 Dec 2023 • Yang Zhan, Yuan Yuan, Zhitong Xiong

To foster this task, we propose Mono3DVG-TR, an end-to-end transformer-based network, which takes advantage of both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.

Object, Object Localization, +1

HTC-DC Net: Monocular Height Estimation from Single Remote Sensing Images

1 code implementation • 28 Sep 2023 • Sining Chen, Yilei Shi, Zhitong Xiong, Xiao Xiang Zhu

To tackle this problem, we propose a method for monocular height estimation from optical imagery, which is currently one of the richest sources of remote sensing data.


Few-shot Object Detection in Remote Sensing: Lifting the Curse of Incompletely Annotated Novel Objects

1 code implementation • 19 Sep 2023 • Fahong Zhang, Yilei Shi, Zhitong Xiong, Xiao Xiang Zhu

In this context, few-shot object detection (FSOD) has emerged as a promising direction, which aims to enable the model to detect novel objects when only a few of them are annotated.

Few-Shot Object Detection, object-detection, +1

Exploring Geometric Deep Learning For Precipitation Nowcasting

no code implementations • 11 Sep 2023 • Shan Zhao, Sudipan Saha, Zhitong Xiong, Niklas Boers, Xiao Xiang Zhu

Motivated by this, we explore a geometric deep learning-based temporal Graph Convolutional Network (GCN) for precipitation nowcasting.

Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval

1 code implementation • 24 Aug 2023 • Yuan Yuan, Yang Zhan, Zhitong Xiong

To address this issue, in this work, we investigate the parameter-efficient transfer learning (PETL) method to effectively and efficiently transfer visual-language knowledge from the natural domain to the RS domain on the image-text retrieval task.

Image-text matching, Retrieval, +2

GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for Remote Sensing Data

1 code implementation • 24 May 2023 • Zhitong Xiong, Sining Chen, Yi Wang, Lichao Mou, Xiao Xiang Zhu

Towards a fair and comprehensive analysis of existing methods, the proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.

Segmentation, Semantic Segmentation

EarthNets: Empowering AI in Earth Observation

no code implementations • 10 Oct 2022 • Zhitong Xiong, Fahong Zhang, Yi Wang, Yilei Shi, Xiao Xiang Zhu

Furthermore, a new platform for Earth observation, termed EarthNets, is released as a means of achieving a fair and consistent evaluation of deep learning methods on remote sensing data.

Scene Understanding, Weather Forecasting

Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation

1 code implementation • 30 Jul 2022 • Zhitong Xiong, Haopeng Li, Xiao Xiang Zhu

To address this problem, we propose to aggregate the learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map.

Few-Shot Semantic Segmentation, Segmentation, +2

Disentangled Latent Transformer for Interpretable Monocular Height Estimation

1 code implementation • 17 Jan 2022 • Zhitong Xiong, Sining Chen, Yilei Shi, Xiao Xiang Zhu

Furthermore, a novel unsupervised semantic segmentation task based on height estimation is first introduced in this work.

Unsupervised Semantic Segmentation

THE Benchmark: Transferable Representation Learning for Monocular Height Estimation

no code implementations • 30 Dec 2021 • Zhitong Xiong, Wei Huang, Jingtao Hu, Xiao Xiang Zhu

Therefore, we propose a new benchmark dataset to study the transferability of height estimation models in a cross-dataset setting.

Representation Learning, Transfer Learning

Change Detection Meets Visual Question Answering

1 code implementation • 12 Dec 2021 • Zhenghang Yuan, Lichao Mou, Zhitong Xiong, Xiaoxiang Zhu

In order to provide every user with flexible access to change information and help them better understand land-cover changes, we introduce a novel task: change detection-based visual question answering (CDVQA) on multi-temporal aerial images.

Answer Generation, Change Detection, +3

ASK: Adaptively Selecting Key Local Features for RGB-D Scene Recognition

no code implementations • 14 Oct 2021 • Zhitong Xiong, Yuan Yuan, Qi Wang

Discriminative local theme-level and object-level representations can be selected with the DLFS module from the spatially-correlated multi-modal RGB-D features.

feature selection, Scene Classification, +1

CM-Net: Concentric Mask based Arbitrary-Shaped Text Detection

no code implementations • 30 Nov 2020 • Chuang Yang, Mulin Chen, Zhitong Xiong, Yuan Yuan, Qi Wang

Extensive experiments demonstrate that the proposed CM fits arbitrary-shaped text instances efficiently and robustly, and also validate the effectiveness of MPF and the constraints loss for discriminative text feature recognition.

Text Detection

Variational Context-Deformable ConvNets for Indoor Scene Parsing

no code implementations • CVPR 2020 • Zhitong Xiong, Yuan Yuan, Nianhui Guo, Qi Wang

The main contributions of this work are as follows: 1) a novel VCD module is proposed, which exploits learnable Gaussian kernels to enable feature learning with structured adaptive context; 2) variational Bayesian probabilistic modeling is introduced for training the VCD module, which makes it continuous and more stable; 3) a perspective-aware guidance module is designed to take advantage of multi-modal information for RGB-D segmentation.

Scene Parsing, Segmentation, +1

VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection

no code implementations • 5 May 2019 • Yuan Yuan, Zhitong Xiong, Qi Wang

Our contributions are as follows: 1) We propose a multi-resolution feature fusion network architecture that exploits densely connected deconvolution layers with skip connections and can learn more effective features for small objects; 2) We frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention (VSSA) module to gain more context information for better detection performance.

object-detection, Object Detection, +1
