Search Results for author: Liqi Yan

Found 8 papers, 4 papers with code

Radiance Field Learners As UAV First-Person Viewers

no code implementations 10 Aug 2024 Liqi Yan, Qifan Wang, Junhan Zhao, Qiang Guan, Zheng Tang, Jianhui Zhang, Dongfang Liu

First-Person-View (FPV) holds immense potential for revolutionizing the trajectory of Unmanned Aerial Vehicles (UAVs), offering an exhilarating avenue for navigating complex building structures.

Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration

no code implementations 15 Dec 2022 Liqi Yan, Qifan Wang, Siqi Ma, Jingang Wang, Changbin Yu

Instance segmentation in videos, which aims to segment and track multiple objects in video frames, has garnered a flurry of research attention in recent years.

Depth Estimation, Instance Segmentation, +3

GL-RG: Global-Local Representation Granularity for Video Captioning

1 code implementation 22 May 2022 Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.

Caption Generation, Descriptive, +1

TF-Blender: Temporal Feature Blender for Video Object Detection

1 code implementation ICCV 2021 Yiming Cui, Liqi Yan, Zhiwen Cao, Dongfang Liu

One of the popular solutions is to exploit the temporal information and enhance per-frame representation through aggregating features from neighboring frames.

Object, object-detection, +1

Hierarchical Attention Fusion for Geo-Localization

1 code implementation 18 Feb 2021 Liqi Yan, Yiming Cui, Yingjie Chen, Dongfang Liu

We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.

Image Retrieval, Retrieval

Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning

no code implementations 1 Sep 2020 Liqi Yan, Dongfang Liu, Yaoxian Song, Changbin Yu

Memory is important for the agent to avoid repeating certain tasks unnecessarily and to adapt adequately to new scenes; we therefore make use of meta-learning.

Meta-Learning, Visual Navigation

Crowd Video Captioning

no code implementations 13 Nov 2019 Liqi Yan, Mingjian Zhu, Changbin Yu

Because deploying reporters at entrances and exits costs a great deal of manpower, automatically describing the behavior of a crowd of off-site spectators is a significant problem that remains unsolved.

Video Captioning
