Search Results for author: Xian Sun

Found 56 papers, 16 papers with code

Multistage Fusion with Forget Gate for Multimodal Summarization in Open-Domain Videos

no code implementations EMNLP 2020 Nayu Liu, Xian Sun, Hongfeng Yu, Wenkai Zhang, Guangluan Xu

Multimodal summarization for open-domain videos is an emerging task, aiming to generate a summary from multisource information (video, audio, transcript).

Decoder

RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks

no code implementations 11 Jun 2024 Zhechao Wang, Peirui Cheng, Pengju Tian, Yuchao Wang, Mingxin Chen, Shujing Duan, Zhirui Wang, Xinming Li, Xian Sun

To overcome this limitation, we propose a Remote Sensing Distributed Foundation Model (RS-DFM) based on generalized information mapping and interaction.

3D Object Detection Depth Estimation +4

ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery

no code implementations 10 Jun 2024 Xian Sun, Qiwei Yan, Chubo Deng, Chenglong Liu, Yi Jiang, Zhongyan Hou, Wanxuan Lu, Fanglong Yao, Xiaoyu Liu, Lingxiang Hao, Hongfeng Yu

Scene Graph Generation (SGG) is a high-level visual understanding and reasoning task aimed at extracting entities (such as objects) and their interrelationships from images.

Graph Generation object-detection +3

UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping

no code implementations 7 Jun 2024 Pengju Tian, Peirui Cheng, Yuchao Wang, Zhechao Wang, Zhirui Wang, Menglong Yan, Xue Yang, Xian Sun

Multi-UAV collaborative 3D object detection can perceive and comprehend complex environments by integrating complementary information, with applications encompassing traffic monitoring, delivery services and agricultural management.

3D Object Detection Management +2

Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes

no code implementations 30 May 2024 Yong-Qiang Mao, Hanbo Bi, Xuexue Li, Kaiqiang Chen, Zhirui Wang, Xian Sun, Kun Fu

Thanks to the application of deep learning technology in point cloud processing of the remote sensing field, point cloud segmentation has become a research hotspot in recent years, which can be applied to real-world 3D, smart cities, and other fields.

Point Cloud Segmentation Segmentation +1

SDL-MVS: View Space and Depth Deformable Learning Paradigm for Multi-View Stereo Reconstruction in Remote Sensing

no code implementations 27 May 2024 Yong-Qiang Mao, Hanbo Bi, Liangyu Xu, Kaiqiang Chen, Zhirui Wang, Xian Sun, Kun Fu

To solve the above problem, we re-examine the deformable learning method in the Multi-View Stereo task and propose a novel paradigm based on view Space and Depth deformable Learning (SDL-MVS), aiming to learn deformable interactions of features in different view spaces and deformably model the depth ranges and intervals to enable high accurate depth estimation.

3D Reconstruction Depth Estimation

TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes

no code implementations 27 Mar 2024 Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Yongqiang Mao, Hanbo Bi, Chenglong Liu, Xian Sun, Kun Fu

To address this issue, we introduce a novel task called Target-Aware Aerial Video Prediction, aiming to simultaneously predict future scenes and motion states of the target.

Disaster Response Object Tracking +1

SFTformer: A Spatial-Frequency-Temporal Correlation-Decoupling Transformer for Radar Echo Extrapolation

no code implementations 28 Feb 2024 Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Fanglong Yao, Xian Sun, Kun Fu

The model leverages stacked multiple SFT-Blocks to not only mine the correlation of the spatiotemporal dynamics of echo cells but also avoid the mutual interference between the temporal modeling and the spatial morphology refinement by decoupling them.

Not Just Learning from Others but Relying on Yourself: A New Perspective on Few-Shot Segmentation in Remote Sensing

1 code implementation 19 Oct 2023 Hanbo Bi, Yingchao Feng, Zhiyuan Yan, Yongqiang Mao, Wenhui Diao, Hongqi Wang, Xian Sun

In addition, to prevent the co-existence of multiple classes in remote sensing scenes from exacerbating the collapse of FSS generalization, we also propose a new Known-class Meta Suppressor (KMS) module to suppress the activation of known-class objects in the sample.

Image Segmentation Semantic Segmentation

RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework

no code implementations 16 Sep 2023 Yuelei Wang, Ting Zhang, Liangjin Zhao, Lin Hu, Zhechao Wang, Ziqing Niu, Peirui Cheng, Kaiqiang Chen, Xuan Zeng, Zhirui Wang, Hongqi Wang, Xian Sun

It is combined by the Transformer module as a low-pass filter to extract global features of RS images through a dual-branch structure, and the CNN module as a stacked high-pass filter to extract fine-grained details effectively.

Thinking Like an Expert: Multimodal Hypergraph-of-Thought (HoT) Reasoning to boost Foundation Modals

no code implementations 11 Aug 2023 Fanglong Yao, Changyuan Tian, Jintao Liu, Zequn Zhang, Qing Liu, Li Jin, Shuchao Li, Xiaoyu Li, Xian Sun

Inspired by this, this paper innovatively proposes a multimodal Hypergraph-of-Thought (HoT) reasoning paradigm, which enables the foundation models to possess the expert-level ability of high-order multi-hop reasoning and multimodal comparative judgement.

Graph Learning Logical Reasoning

OGMN: Occlusion-guided Multi-task Network for Object Detection in UAV Images

no code implementations 24 Apr 2023 Xuexue Li, Wenhui Diao, Yongqiang Mao, Peng Gao, Xiuhua Mao, Xinming Li, Xian Sun

One interaction for the guide is between two task decoders to address the feature confusion problem, and an occlusion decoupling head (ODH) is proposed to replace the general detection head.

object-detection Object Detection +1

LIGHT: Joint Individual Building Extraction and Height Estimation from Satellite Images through a Unified Multitask Learning Network

no code implementations 3 Apr 2023 Yongqiang Mao, Xian Sun, Xingliang Huang, Kaiqiang Chen

Building extraction and height estimation are two important basic tasks in remote sensing image interpretation, which are widely used in urban planning, real-world 3D construction, and other fields.

Instance Segmentation Semantic Segmentation

SiamTHN: Siamese Target Highlight Network for Visual Tracking

no code implementations 22 Mar 2023 Jiahao Bao, Kaiqiang Chen, Xian Sun, Liangjin Zhao, Wenhui Diao, Menglong Yan

The majority of siamese network based trackers now in use treat each channel in the feature maps generated by the backbone network equally, making the similarity response map sensitive to background influence and hence challenging to focus on the target region.

regression Visual Object Tracking +1

TOT: Topology-Aware Optimal Transport For Multimodal Hate Detection

no code implementations 27 Feb 2023 Linhao Zhang, Li Jin, Xian Sun, Guangluan Xu, Zequn Zhang, Xiaoyu Li, Nayu Liu, Qing Liu, Shiyao Yan

Multimodal hate detection, which aims to identify harmful content online such as memes, is crucial for building a wholesome internet environment.

Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery

no code implementations 11 Jan 2023 Yongqiang Mao, Kaiqiang Chen, Liangjin Zhao, Wei Chen, Deke Tang, Wenjie Liu, Zhirui Wang, Wenhui Diao, Xian Sun, Kun Fu

Our Building3D is rooted in the SFFDE network for building elevation prediction, synchronized with a building extraction network for building masks, and then sequentially performs point cloud reconstruction, surface reconstruction (or CityGML model reconstruction).

Point cloud reconstruction Surface Reconstruction

Beyond the Limitation of Monocular 3D Detector via Knowledge Distillation

no code implementations ICCV 2023 Yiran Yang, Dongshuo Yin, Xuee Rong, Xian Sun, Wenhui Diao, Xinming Li

Moreover, we construct a depth-guided matrix by the predicted depth gap of teacher and student to facilitate the model to learn more knowledge of farther objects in prediction level distillation.

Knowledge Distillation

1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions

no code implementations CVPR 2023 Dongshuo Yin, Yiran Yang, Zhechao Wang, Hongfeng Yu, Kaiwen Wei, Xian Sun

Fine-tuning large-scale pre-trained vision models to downstream tasks is a standard technique for achieving state-of-the-art performance on computer vision benchmarks.

Instance Segmentation object-detection +3

Probabilistic Deep Metric Learning for Hyperspectral Image Classification

1 code implementation 15 Nov 2022 Chengkun Wang, Wenzhao Zheng, Xian Sun, Jiwen Lu, Jie Zhou

We propose to learn a global probabilistic distribution for each pixel in the patch and a probabilistic metric to model the distance between distributions.

Classification Hyperspectral Image Classification +1

Learning to Evaluate Performance of Multi-modal Semantic Localization

1 code implementation 14 Sep 2022 Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun

Finally, we analyze the SeLo performance of RS cross-modal retrieval models in detail, explore the impact of different variables on this task, and provide a complete benchmark for the SeLo task.

Cross-Modal Retrieval Referring Expression +2

Beyond single receptive field: A receptive field fusion-and-stratification network for airborne laser scanning point cloud classification

1 code implementation 21 Jul 2022 Yongqiang Mao, Kaiqiang Chen, Wenhui Diao, Xian Sun, Xiaonan Lu, Kun Fu, Martin Weinmann

With receptive field fusion-and-stratification, RFFS-Net is more adaptable to the classification of regions with complex structures and extreme scale variations in large-scale ALS point clouds.

Classification Point Cloud Classification

Learning Invariant Visual Representations for Compositional Zero-Shot Learning

1 code implementation 1 Jun 2022 Tian Zhang, Kongming Liang, Ruoyi Du, Xian Sun, Zhanyu Ma, Jun Guo

Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions using knowledge learned from seen attribute-object compositions in the training set.

Attribute Compositional Zero-Shot Learning +2

A Span-level Bidirectional Network for Aspect Sentiment Triplet Extraction

1 code implementation 27 Apr 2022 Yuqi Chen, Keming Chen, Xian Sun, Zequn Zhang

Aspect Sentiment Triplet Extraction (ASTE) is a new fine-grained sentiment analysis task that aims to extract triplets of aspect terms, sentiments, and opinion terms from review sentences.

Aspect Sentiment Triplet Extraction Decoder

Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information

1 code implementation 21 Apr 2022 Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, Xian Sun

In this article, we first propose a novel RSCTIR framework based on global and local information (GaLR), and design a multi-level information dynamic fusion (MIDF) module to efficaciously integrate features of different levels.

Cross-Modal Retrieval Image Retrieval +1

Semantic Segmentation for Point Cloud Scenes via Dilated Graph Feature Aggregation and Pyramid Decoders

no code implementations 11 Apr 2022 Yongqiang Mao, Xian Sun, Kaiqiang Chen, Wenhui Diao, Zonghao Guo, Xiaonan Lu, Kun Fu

Due to the unicity of receptive field, semantic segmentation of point clouds remains challenging for the expression of multi-receptive field features, which brings about the misclassification of instances with similar spatial structures.

Segmentation Semantic Segmentation

Optical Flow Training under Limited Label Budget via Active Learning

1 code implementation 9 Mar 2022 Shuai Yuan, Xian Sun, Hannah Kim, Shuzhi Yu, Carlo Tomasi

Supervised training of optical flow predictors generally yields better accuracy than unsupervised training.

Active Learning Optical Flow Estimation

Learning by Active Forgetting for Neural Networks

no code implementations 21 Nov 2021 Jian Peng, Xian Sun, Min Deng, Chao Tao, Bo Tang, Wenbo Li, Guohua Wu, Qing Zhu, Yu Liu, Tao Lin, Haifeng Li

This paper presents a learning model by active forgetting mechanism with artificial neural networks.

1213Li at SemEval-2021 Task 6: Detection of Propaganda with Multi-modal Attention and Pre-trained Models

no code implementations SEMEVAL 2021 Peiguang Li, Xuan Li, Xian Sun

This paper presents the solution proposed by the 1213Li team for subtask 3 in SemEval-2021 Task 6: identifying the multiple persuasion techniques used in the multi-modal content of the meme.

Double Similarity Distillation for Semantic Image Segmentation

no code implementations 19 Jul 2021 Yingchao Feng, Xian Sun, Wenhui Diao, Jihao Li, Xin Gao

In this paper, motivated by the residual learning and global aggregation, we propose a simple yet general and effective knowledge distillation framework called double similarity distillation (DSD) to improve the classification accuracy of all existing compact networks by capturing the similarity knowledge in pixel and category dimensions, respectively.

Image Segmentation Knowledge Distillation +2

Cross-layer Navigation Convolutional Neural Network for Fine-grained Visual Classification

no code implementations 21 Jun 2021 Chenyu Guo, Jiyang Xie, Kongming Liang, Xian Sun, Zhanyu Ma

Then, attention mechanisms are used after feature fusion to extract spatial and channel information while linking the high-level semantic information and the low-level texture features, which can better locate the discriminative regions for the FGVC.

Fine-Grained Image Classification

HYPER^2: Hyperbolic Poincare Embedding for Hyper-Relational Link Prediction

no code implementations 20 Apr 2021 Shiyao Yan, Zequn Zhang, Xian Sun, Guangluan Xu, Li Jin, Shuchao Li

Link Prediction, addressing the issue of completing KGs with missing facts, has been broadly studied.

Link Prediction

FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery

no code implementations 9 Mar 2021 Xian Sun, Peijin Wang, Zhiyuan Yan, Feng Xu, Ruiping Wang, Wenhui Diao, Jin Chen, Jihao Li, Yingchao Feng, Tao Xu, Martin Weinmann, Stefan Hinz, Cheng Wang, Kun Fu

In this paper, we propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery which is named as FAIR1M.

Object object-detection +2

AF-EMS Detector: Improve the Multi-Scale Detection Performance of the Anchor-Free Detector

no code implementations Remote Sensing 2021 Jiangqiao Yan, Liangjin Zhao, Wenhui Diao, Hongqi Wang, Xian Sun

With the objects to be detected becoming more complex, the problem of multi-scale object detection has attracted more and more attention, especially in the field of remote sensing detection.

Object object-detection +3

High Quality Remote Sensing Image Super-Resolution Using Deep Memory Connected Network

no code implementations 1 Oct 2020 Wenjia Xu, Guangluan Xu, Yang Wang, Xian Sun, Daoyu Lin, Yirong Wu

Single image super-resolution is an effective way to enhance the spatial resolution of remote sensing image, which is crucial for many applications such as target detection and image classification.

Image Classification Image Super-Resolution

A Novel Training Protocol for Performance Predictors of Evolutionary Neural Architecture Search Algorithms

no code implementations 30 Aug 2020 Yanan Sun, Xian Sun, Yuhan Fang, Gary Yen

Performance predictors are a type of regression models which can assist to accomplish the search, while without exerting much computational resource.

Neural Architecture Search regression

Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images

no code implementations 9 Jan 2020 Ruigang Niu, Xian Sun, Yu Tian, Wenhui Diao, Kaiqiang Chen, Kun Fu

Semantic segmentation in very high resolution (VHR) aerial images is one of the most challenging tasks in remote sensing image understanding.

Semantic Segmentation

Oriented Objects as pairs of Middle Lines

no code implementations 23 Dec 2019 Hao-Ran Wei, Yue Zhang, Zhonghan Chang, Hao Li, Hongqi Wang, Xian Sun

It is noteworthy that the objects in COCO can be regarded as a special form of oriented objects with an angle of 90 degrees.

object-detection Object Detection In Aerial Images +4

Ship Instance Segmentation From Remote Sensing Images Using Sequence Local Context Module

no code implementations 22 Apr 2019 Yingchao Feng, Wenhui Diao, Zhonghan Chang, Menglong Yan, Xian Sun, Xin Gao

The performance of object instance segmentation in remote sensing images has been greatly improved through the introduction of many landmark frameworks based on convolutional neural network.

Instance Segmentation Segmentation +1

Comparison Network for One-Shot Conditional Object Detection

no code implementations 4 Apr 2019 Tengfei Zhang, Yue Zhang, Xian Sun, Hao Sun, Menglong Yan, Xue Yang, Kun Fu

A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability.

Object object-detection +1

A Remote Sensing Image Dataset for Cloud Removal

2 code implementations 3 Jan 2019 Daoyu Lin, Guangluan Xu, Xiaoke Wang, Yang Wang, Xian Sun, Kun Fu

Removing clouds is an indispensable pre-processing step in remote sensing image analysis.

Change Detection Cloud Removal +1

Wider Channel Attention Network for Remote Sensing Image Super-resolution

no code implementations 13 Dec 2018 Jun Gu, Guangluan Xu, Yue Zhang, Xian Sun, Ran Wen, Lei Wang

In this letter, we propose a novel single-image super-resolution (SISR) algorithm named Wider Channel Attention Network (WCAN) for remote sensing images.

Image Super-Resolution

Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network

3 code implementations 13 Jun 2018 Xue Yang, Hao Sun, Xian Sun, Menglong Yan, Zhi Guo, Kun Fu

The complexity of application scenarios, the redundancy of detection region, and the difficulty of dense ship detection are all the main obstacles that limit the successful operation of traditional methods in ship detection.

Position

Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

4 code implementations 12 Jun 2018 Xue Yang, Hao Sun, Kun Fu, Jirui Yang, Xian Sun, Menglong Yan, Zhi Guo

Additionally, in the case of ship rotation and dense arrangement, we design a rotation anchor strategy to predict the minimum circumscribed rectangle of the object so as to reduce the redundant detection region and improve the recall.

object-detection Object Detection

MARTA GANs: Unsupervised Representation Learning for Remote Sensing Image Classification

no code implementations 28 Dec 2016 Daoyu Lin, Kun Fu, Yang Wang, Guangluan Xu, Xian Sun

With the development of deep learning, supervised learning has frequently been adopted to classify remotely sensed images using convolutional networks (CNNs).

Classification General Classification +3
