Search Results for author: Yang Zhan

Found 4 papers, 4 papers with code

SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model

1 code implementation • 18 Jan 2024 • Yang Zhan, Zhitong Xiong, Yuan Yuan

Specifically, after projecting RS visual features to the language domain via an alignment layer, they are fed jointly with task-specific instructions into an LLM-based RS decoder to predict answers for RS open-ended tasks.

Instruction Following, Language Modelling, +2

Mono3DVG: 3D Visual Grounding in Monocular Images

1 code implementation • 13 Dec 2023 • Yang Zhan, Yuan Yuan, Zhitong Xiong

To foster this task, we propose Mono3DVG-TR, an end-to-end transformer-based network, which takes advantage of both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.

Object, Object Localization, +1

Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval

1 code implementation • 24 Aug 2023 • Yuan Yuan, Yang Zhan, Zhitong Xiong

To address this issue, in this work, we investigate the parameter-efficient transfer learning (PETL) method to effectively and efficiently transfer visual-language knowledge from the natural domain to the RS domain on the image-text retrieval task.

Image-text matching, Retrieval, +2
