Search Results for author: Yongjun Zhang

Found 36 papers, 15 papers with code

Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer

1 code implementation · 2 Jan 2025 · Ziyang Chen, Yongjun Zhang, Wenting Li, Bingshu Wang, Yabo Wu, Yong Zhao, C. L. Philip Chen

However, constrained by the low-rank bottleneck and quadratic complexity of attention mechanisms, stereo transformers still fail to demonstrate sufficient nonlinear expressiveness within a reasonable inference time.

Stereo Matching

Cross-View Image Set Geo-Localization

no code implementations · 25 Dec 2024 · Qiong Wu, Panwang Xia, Lei Yu, Yi Liu, Mingtao Xiong, Liheng Zhong, Jingdong Chen, Ming Yang, Yongjun Zhang, Yi Wan

Therefore, we propose a novel task: Cross-View Image Set Geo-Localization (Set-CVGL), which gathers multiple images with diverse perspectives as a query set for localization.

geo-localization

Cross-View Geo-Localization with Street-View and VHR Satellite Imagery in Decentrality Settings

2 code implementations · 16 Dec 2024 · Panwang Xia, Lei Yu, Yi Wan, Qiong Wu, Peiqi Chen, Liheng Zhong, Yongxiang Yao, Dong Wei, Xinyi Liu, Lixiang Ru, Yingying Zhang, Jiangwei Lao, Jingdong Chen, Ming Yang, Yongjun Zhang

To address this limitation, we introduce DReSS (Decentrality Related Street-view and Satellite-view dataset), a novel dataset designed to evaluate cross-view geo-localization with a large geographic scope and diverse landscapes, emphasizing the decentrality issue.

Disaster Response geo-localization +1

Motif Channel Opened in a White-Box: Stereo Matching via Motif Correlation Graph

1 code implementation · 19 Nov 2024 · Ziyang Chen, Yongjun Zhang, Wenting Li, Bingshu Wang, Yong Zhao, C. L. Philip Chen

However, learning-based stereo matching methods inherently suffer from the loss of geometric structures in certain feature channels, creating a bottleneck in achieving precise detail matching.

Autonomous Driving Stereo Matching

Nighttime Pedestrian Detection Based on Fore-Background Contrast Learning

no code implementations · 6 Aug 2024 · He Yao, Yongjun Zhang, Huachun Jian, Li Zhang, Ruzhong Cheng

The significance of background information is frequently overlooked in contemporary research concerning channel attention mechanisms.

Pedestrian Detection

MRIo3DS-Net: A Mutually Reinforcing Images to 3D Surface RNN-like framework for model-adaptation indoor 3D reconstruction

no code implementations · 16 Jul 2024 · Chang Li, Jiao Guo, Yufei Zhao, Yongjun Zhang

This paper is the first to propose an end-to-end, RNN-like framework in which images and 3D surfaces mutually reinforce each other for model-adaptation indoor 3D reconstruction: multi-view dense matching and point cloud surface optimization are mutually reinforced through an RNN-like structure rather than being treated as separate problems. The characteristics are as follows:
(1) In the multi-view dense matching module (MVDMM), the model-adaptation strategy is used to fine-tune and optimize a Transformer-based multi-view dense matching DNN, so that it has stronger image feature matching and detail expression capabilities.
(2) In the point cloud surface optimization module (PCSOM), the 3D surface reconstruction network based on a 3D implicit field is optimized using the model-adaptation strategy, which solves the problem of point cloud surface optimization without knowing the normal vectors of the 3D surface. To reconstruct finer 3D surfaces from point clouds, a smooth loss is proposed and added to this module.
(3) MRIo3DS-Net is an RNN-like framework that uses the finely optimized 3D surface obtained by PCSOM to recursively reinforce the differentiable warping used to optimize MVDMM. This refinement yields better dense matching results, and better dense matching results in turn yield better 3D surfaces, recursively and mutually. Hence, the model-adaptation strategy can better reconcile the differences between the two network modules, so that they complement each other.
(4) To accelerate transfer learning and training convergence from the source domain to the target domain, a multi-task loss function based on Bayesian uncertainty is used to adaptively adjust the weights between the loss functions of the two networks, MVDMM and PCSOM.
(5) In this multi-task cascade network framework, any module can be replaced by a state-of-the-art network to achieve better 3D reconstruction results.

3D Reconstruction Surface Reconstruction +1

UDHF2-Net: Uncertainty-diffusion-model-based High-Frequency TransFormer Network for Remotely Sensed Imagery Interpretation

no code implementations · 23 Jun 2024 · Pengfei Zhang, Chang Li, Yongjun Zhang, Rongjun Qin

Besides addressing the aforementioned spectral noise in semantic segmentation, MUDM also serves as a self-supervised learning strategy that effectively reduces false change detections at edges in generated imagery with geometric registration errors.

Change Detection Self-Supervised Learning +1

SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding

1 code implementation · 14 Jun 2024 · Junwei Luo, Zhen Pang, Yongjun Zhang, Tingzhu Wang, LinLin Wang, Bo Dang, Jiangwei Lao, Jian Wang, Jingdong Chen, Yihua Tan, Yansheng Li

Remote Sensing Large Multi-Modal Models (RSLMMs) are developing rapidly and showcase significant capabilities in remote sensing imagery (RSI) comprehension.

Graph Generation Relation +1

STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

3 code implementations · 13 Jun 2024 · Yansheng Li, LinLin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

This paper constructs a large-scale dataset for SGG in large-size VHR SAI with image sizes ranging from 512 x 768 to 27,860 x 31,096 pixels, named STAR (Scene graph generaTion in lArge-size satellite imageRy), encompassing over 210K objects and over 400K triplets.

Graph Generation Object +3

Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation

no code implementations · 14 Apr 2024 · Jieyi Tan, Yansheng Li, Sergey A. Bartalev, Shinkarenko Stanislav, Bo Dang, Yongjun Zhang, Liangqi Yuan, Wei Chen

Our framework consists of three modules, including the Global Insight Enhancement (GIE) module, the Essential Feature Mining (EFM) module and the Local-Global Balance (LoGo) module.

Earth Observation Federated Learning +2

AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation

no code implementations · 11 Apr 2024 · Yansheng Li, Kun Li, Yongjun Zhang, LinLin Wang, Dingwen Zhang

To fill in the gap of the overhead view dataset, this paper constructs and releases an aerial image urban scene graph generation (AUG) dataset.

Graph Generation Relationship Detection +1

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

1 code implementation · CVPR 2024 · Ziyang Chen, Wei Long, He Yao, Yongjun Zhang, Bingshu Wang, Yongbin Qin, Jia Wu

In addition, because edge variations in potential feature channels of the reconstruction error map also affect detail matching, we propose the Reconstruction Error Motif Penalty (REMP) module to further refine the full-resolution disparity estimation.

Disparity Estimation Stereo Depth Estimation +1

Adaptive Convolutional Neural Network for Image Super-resolution

2 code implementations · 24 Feb 2024 · Chunwei Tian, Xuanyu Zhang, Tao Wang, Yongjun Zhang, Qi Zhu, Chia-Wen Lin

The lower network uses a symmetric architecture to enhance the relations between different layers and mine more structural information, which is complementary to the upper network for image super-resolution.

Image Super-Resolution Relation

SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector

no code implementations · 8 Feb 2024 · Qianchen Mao, Qiang Li, Bingshu Wang, Yongjun Zhang, Tao Dai, C. L. Philip Chen

To tackle this challenge, we propose SpirDet, a novel approach for efficient detection of infrared small targets.

Decoder

Learning to Holistically Detect Bridges from Large-Size VHR Remote Sensing Imagery

no code implementations · 5 Dec 2023 · Yansheng Li, Junwei Luo, Yongjun Zhang, Yihua Tan, Jin-Gang Yu, Song Bai

Therefore, to ensure the visibility and integrity of bridges, it is essential to perform holistic bridge detection in large-size very-high-resolution (VHR) RSIs.

Object Detection

LLVMs4Protest: Harnessing the Power of Large Language and Vision Models for Deciphering Protests in the News

1 code implementation · 30 Nov 2023 · Yongjun Zhang

First, the Longformer model was fine-tuned using the Dynamics of Collective Action (DoCA) Corpus.

Imbalance Knowledge-Driven Multi-modal Network for Land-Cover Semantic Segmentation Using Images and LiDAR Point Clouds

no code implementations · 28 Mar 2023 · Yameng Wang, Yi Wan, Yongjun Zhang, Bin Zhang, Zhi Gao

Existing multi-modal methods usually map high-dimensional features to low-dimensional spaces as a preprocessing step before feature extraction to address the non-negligible domain gap, which inevitably leads to information loss.

Semantic Segmentation

GLH-Water: A Large-Scale Dataset for Global Surface Water Detection in Large-Size Very-High-Resolution Satellite Imagery

no code implementations · 16 Mar 2023 · Yansheng Li, Bo Dang, Wanchun Li, Yongjun Zhang

Global surface water detection in very-high-resolution (VHR) satellite imagery can directly serve major applications such as refined flood mapping and water resource assessment.

Semantic Segmentation

RIFT2: Speeding-up RIFT with A New Rotation-Invariance Technique

1 code implementation · 1 Mar 2023 · Jiayuan Li, Pengcheng Shi, Qingwu Hu, Yongjun Zhang

Multimodal image matching is an important prerequisite for multisource image information fusion.

High-Frequency Stereo Matching Network

no code implementations · CVPR 2023 · Haoliang Zhao, Huizhou Zhou, Yongjun Zhang, Jie Chen, Yitong Yang, Yong Zhao

In the field of binocular stereo matching, remarkable progress has been made by iterative methods like RAFT-Stereo and CREStereo.

Stereo Matching

Progressive Learning with Cross-Window Consistency for Semi-Supervised Semantic Segmentation

no code implementations · 22 Nov 2022 · Bo Dang, Yansheng Li, Yongjun Zhang, Jiayi Ma

Semi-supervised semantic segmentation focuses on the exploration of a small amount of labeled data and a large amount of unlabeled data, which is more in line with the demands of real-world image understanding applications.

Pseudo Label Semi-Supervised Semantic Segmentation

EHSNet: End-to-End Holistic Learning Network for Large-Size Remote Sensing Image Semantic Segmentation

no code implementations · 21 Nov 2022 · Wei Chen, Yansheng Li, Bo Dang, Yongjun Zhang

This paper presents EHSNet, a new end-to-end segmentation network designed for the holistic learning of large-size remote sensing image semantic segmentation (LRISS).

Semantic Segmentation

Hierarchical Memory Learning for Fine-Grained Scene Graph Generation

no code implementations · 14 Mar 2022 · Youming Deng, Yansheng Li, Yongjun Zhang, Xiang Xiang, Jian Wang, Jingdong Chen, Jiayi Ma

After the autonomous partition of coarse and fine predicates, the model is first trained on the coarse predicates and then learns the fine predicates.

Graph Generation Scene Graph Generation

LiDAR-guided Stereo Matching with a Spatial Consistency Constraint

no code implementations · 21 Feb 2022 · Yongjun Zhang, Siyuan Zou, Xinyi Liu, Xu Huang, Yi Wan, Yongxiang Yao

Next, we propose a riverbed enhancement function to optimize the cost volume of the LiDAR projection points and their homogeneous pixels to improve the matching robustness.

Stereo Matching

Asymmetric Hash Code Learning for Remote Sensing Image Retrieval

1 code implementation · 15 Jan 2022 · Weiwei Song, Zhi Gao, Renwei Dian, Pedram Ghamisi, Yongjun Zhang, Jón Atli Benediktsson

In this paper, we propose a novel deep hashing method, named asymmetric hash code learning (AHCL), for RSIR.

Deep Hashing Image Retrieval

RMNA: A Neighbor Aggregation-Based Knowledge Graph Representation Learning Model Using Rule Mining

1 code implementation · 1 Nov 2021 · Ling Chen, Jun Cui, Xing Tang, Chaodu Song, Yuntao Qian, Yansheng Li, Yongjun Zhang

Therefore, neighbor aggregation-based representation learning (NARL) models are proposed, which encode the information in the neighbors of an entity into its embeddings.

Graph Representation Learning Knowledge Graph Completion

Group-Aware Graph Neural Network for Nationwide City Air Quality Forecasting

1 code implementation · 27 Aug 2021 · Ling Chen, Jiahui Xu, Binqing Wu, Yuntao Qian, Zhenhong Du, Yansheng Li, Yongjun Zhang

The model constructs a city graph and a city group graph to model the spatial and latent dependencies between cities, respectively.

graph construction Graph Neural Network

Collaboratively boosting data-driven deep learning and knowledge-guided ontological reasoning for semantic segmentation of remote sensing imagery

no code implementations · 6 Oct 2020 · Yansheng Li, Song Ouyang, Yongjun Zhang

As one architecture from the deep learning family, the deep semantic segmentation network (DSSN) has achieved a degree of success on the semantic segmentation task and clearly outperforms traditional methods based on hand-crafted features.

Segmentation Of Remote Sensing Imagery Semantic Segmentation

FA-Harris: A Fast and Asynchronous Corner Detector for Event Cameras

no code implementations · 26 Jun 2019 · Ruoxiang Li, Dianxi Shi, Yongjun Zhang, Kaiyue Li, Ruihao Li

The proposed G-SAE maintenance algorithm and corner candidate selection algorithm greatly enhance the real-time performance of corner detection, while the corner candidate refinement algorithm maintains accuracy by using an improved event-based Harris detector.
