Search Results for author: Liqi Yan

Found 7 papers, 4 papers with code

Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration

no code implementations • 15 Dec 2022 • Liqi Yan, Qifan Wang, Siqi Ma, Jingang Wang, Changbin Yu

Instance segmentation in videos, which aims to segment and track multiple objects in video frames, has garnered a flurry of research attention in recent years.

Depth Estimation, Instance Segmentation +3


GL-RG: Global-Local Representation Granularity for Video Captioning

1 code implementation • 22 May 2022 • Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.

Caption Generation, Descriptive +1


TF-Blender: Temporal Feature Blender for Video Object Detection

1 code implementation • ICCV 2021 • Yiming Cui, Liqi Yan, Zhiwen Cao, Dongfang Liu

One popular solution for video object detection is to exploit temporal information and enhance per-frame representations by aggregating features from neighboring frames.

Object, Object Detection +1

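The neighbor-frame aggregation mentioned in the TF-Blender entry can be illustrated with a minimal sketch. This is not the TF-Blender method itself: it assumes plain similarity-weighted averaging, with each frame represented by a flat feature vector, and the names `cosine`, `aggregate`, and `window` are hypothetical.

```python
from math import exp, sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def aggregate(features, t, window=1):
    """Blend the feature of frame t with its neighbors in [t - window, t + window].

    Assumption: weights are a softmax over cosine similarity to frame t.
    This is an illustrative choice, not the weighting used in the paper.
    """
    lo = max(0, t - window)
    hi = min(len(features) - 1, t + window)
    ref = features[t]
    idx = list(range(lo, hi + 1))
    scores = [exp(cosine(ref, features[i])) for i in idx]
    total = sum(scores)
    weights = [s / total for s in scores]
    # Weighted sum of the neighboring features, one dimension at a time.
    return [
        sum(w * features[i][d] for w, i in zip(weights, idx))
        for d in range(len(ref))
    ]
```

Calling `aggregate(frames, t)` returns a vector of the same length as each per-frame feature; frames that resemble frame `t` contribute more to the blend.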

Hierarchical Attention Fusion for Geo-Localization

1 code implementation • 18 Feb 2021 • Liqi Yan, Yiming Cui, Yingjie Chen, Dongfang Liu

We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.

Image Retrieval, Retrieval

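The idea of fusing hierarchical CNN feature maps into one image representation, as described in the geo-localization entry, can be sketched as follows. This is an assumption-laden illustration, not the paper's fusion scheme: it uses global average pooling, L2 normalization, and concatenation, and the names `global_avg_pool`, `l2_normalize`, and `fuse` are hypothetical.

```python
from math import sqrt


def global_avg_pool(fmap):
    """fmap is a list of channels, each an H x W nested list; returns one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]


def l2_normalize(v):
    """Scale v to unit length; an all-zero vector is returned unchanged."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def fuse(feature_maps):
    """Concatenate the normalized pooled vector of each level into one descriptor.

    Assumption: every level contributes equally; a learned attention weight
    per level would replace this equal weighting in a real model.
    """
    descriptor = []
    for fmap in feature_maps:
        descriptor.extend(l2_normalize(global_avg_pool(fmap)))
    return l2_normalize(descriptor)
```

The resulting unit-length descriptor can be compared across images with a dot product, which is the usual setup for image retrieval.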

DenserNet: Weakly Supervised Visual Localization Using Multi-scale Feature Aggregation

1 code implementation • 4 Dec 2020 • Dongfang Liu, Yiming Cui, Liqi Yan, Christos Mousas, Baijian Yang, Yingjie Chen

In this work, we introduce a Denser Feature Network (DenserNet) for visual localization.

Image Retrieval, Retrieval +1


Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning

no code implementations • 1 Sep 2020 • Liqi Yan, Dongfang Liu, Yaoxian Song, Changbin Yu

Memory is important for the agent to avoid repeating certain tasks unnecessarily and to adapt adequately to new scenes; therefore, we make use of meta-learning.

Ranked #1 on Visual Navigation on AI2-THOR

Meta-Learning, Visual Navigation


Crowd Video Captioning

no code implementations • 13 Nov 2019 • Liqi Yan, Mingjian Zhu, Changbin Yu

Since deploying reporters at entrances and exits requires considerable manpower, automatically describing the behavior of a crowd of off-site spectators is an important and still unsolved problem.

Video Captioning

