Search Results for author: Ming Dai

Found 8 papers, 6 papers with code

Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints

1 code implementation · 12 Jan 2025 · Ming Dai, Jian Li, Jiedong Zhuang, Xian Zhang, Wankou Yang

Furthermore, to address the challenge of insufficient multimodal understanding, we leverage pre-trained models based on visual-linguistic fusion representations.

Image Segmentation, Referring Expression, +3

A Survey on Benchmarks of Multimodal Large Language Models

1 code implementation · 16 Aug 2024 · Jian Li, Weiheng Lu, Hao Fei, Meng Luo, Ming Dai, Min Xia, Yizhang Jin, Zhenye Gan, Ding Qi, Chaoyou Fu, Ying Tai, Wankou Yang, Yabiao Wang, Chengjie Wang

Multimodal Large Language Models (MLLMs) are gaining increasing popularity in both academia and industry due to their remarkable performance in various applications such as visual question answering, visual perception, understanding, and reasoning.

Question Answering, Survey, +1

OS-FPI: A Coarse-to-Fine One-Stream Network for UAV Geo-Localization

no code implementations · 10 Mar 2024 · Jiahao Chen, Enhui Zheng, Ming Dai, Yifu Chen, Yusheng Lu

Furthermore, its performance in meter-level localization accuracy is impressive, with 182.62% improvement in 3-meter accuracy, 164.17% improvement in 5-meter accuracy, and 137.43% improvement in 10-meter accuracy.

geo-localization

Drone Referring Localization: An Efficient Heterogeneous Spatial Feature Interaction Method For UAV Self-Localization

1 code implementation · 13 Aug 2022 · Ming Dai, Enhui Zheng, Jiahao Chen, Lei Qi, ZhenHua Feng, Wankou Yang

However, IR-based methods face several challenges: 1) Pre- and post-processing incur significant computational and storage overhead; 2) The lack of interaction between dual-source features impairs precise spatial perception.

Image Retrieval, Retrieval, +1

A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

1 code implementation · 23 Jan 2022 · Ming Dai, Jianhong Hu, Jiedong Zhuang, Enhui Zheng

However, it still has some limitations, e.g., it can only extract part of the information in the neighborhood, and some scale reduction operations cause fine-grained information to be lost.

Drone navigation, Drone-view target localization, +1

Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments

1 code implementation · 23 Jan 2022 · Ming Dai, Enhui Zheng, ZhenHua Feng, Jiedong Zhuang, Wankou Yang

Last, we enhance the Recall@K metric and introduce a new measurement, SDM@K, to evaluate the performance of a trained model from both the retrieval and localization perspectives simultaneously.

geo-localization, Metric Learning, +1
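For readers unfamiliar with the Recall@K metric named in the entry above, the following is a minimal sketch of the standard retrieval definition: the fraction of queries whose correct match appears among the top K retrieved items. It is not the enhanced Recall@K or the SDM@K measurement from the paper, whose definitions are not given in this listing. The function name `recall_at_k` and the integer-ID data layout are assumptions made for illustration.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[int]],
    true_ids: Sequence[int],
    k: int,
) -> float:
    """Standard Recall@K for single-target retrieval.

    ranked_ids[i] is the gallery ranking (best match first) for query i,
    and true_ids[i] is the one correct gallery ID for that query.
    This layout is a hypothetical choice for the sketch.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(ranked_ids) != len(true_ids):
        raise ValueError("ranked_ids and true_ids must have the same length")
    if not true_ids:
        return 0.0
    # A query counts as a hit when its true ID is within the first k results.
    hits = sum(
        1
        for ranking, target in zip(ranked_ids, true_ids)
        if target in ranking[:k]
    )
    return hits / len(true_ids)


# Toy data: three queries, each with a ranking over gallery IDs 0..3.
rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 1]]
targets = [1, 2, 1]
recall_top2 = recall_at_k(rankings, targets, k=2)
```

In the toy data, two of the three queries have their target within the top two results, so `recall_top2` is 2/3. The metric only checks whether the match is retrieved, not how far the retrieved location is from the true one, which is the gap the entry's localization-aware measurement is described as addressing.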
