Search Results for author: Guangyao Li

Found 15 papers, 10 papers with code

Scene Text Detection with Supervised Pyramid Context Network

2 code implementations 21 Nov 2018 Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li

We propose a supervised pyramid context network (SPCNET) to precisely locate text regions while suppressing false positives.

Instance Segmentation Scene Text Detection +2

Learning to Answer Questions in Dynamic Audio-Visual Scenarios

1 code implementation CVPR 2022 Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, Di Hu

In this paper, we focus on the Audio-Visual Question Answering (AVQA) task, which aims to answer questions regarding different visual objects, sounds, and their associations in videos.

audio-visual learning Audio-visual Question Answering +4

Self-supervised Audiovisual Representation Learning for Remote Sensing Data

1 code implementation 2 Aug 2021 Konrad Heidler, Lichao Mou, Di Hu, Pu Jin, Guangyao Li, Chuang Gan, Ji-Rong Wen, Xiao Xiang Zhu

By fine-tuning the models on a number of commonly used remote sensing datasets, we show that our approach outperforms existing pre-training strategies for remote sensing imagery.

Cross-Modal Retrieval Representation Learning +1

Multi-Scale Attention for Audio Question Answering

1 code implementation 29 May 2023 Guangyao Li, Yixin Xu, Di Hu

Audio question answering (AQA), a widely used proxy task for exploring scene understanding, has received increasing attention.

Audio Question Answering Question Answering +2

I Know What You Do Not Know: Knowledge Graph Embedding via Co-distillation Learning

1 code implementation 21 Aug 2022 Yang Liu, Zequn Sun, Guangyao Li, Wei Hu

To this end, we propose CoLE, a Co-distillation Learning method for KG Embedding that exploits the complementarity of graph structures and text information.

Knowledge Graph Embedding Language Modelling

Rule-Guided Graph Neural Networks for Recommender Systems

1 code implementation 9 Sep 2020 Xinze Lyu, Guangyao Li, Jiacheng Huang, Wei Hu

However, existing work that incorporates KGs cannot capture the explicit long-range semantics between users and items while also considering the various kinds of connectivity between items.

Collaborative Filtering Knowledge Graphs +1

Progressive Spatio-temporal Perception for Audio-Visual Question Answering

1 code implementation 10 Aug 2023 Guangyao Li, Wenxuan Hou, Di Hu

Such naturally multi-modal videos are composed of rich and complex dynamic audio-visual components, most of which may be unrelated to the given question or may even interfere with answering the content of interest.

Audio-visual Question Answering Audio-Visual Question Answering (AVQA) +2

EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs

1 code implementation 5 Nov 2022 Xiaobin Tian, Zequn Sun, Guangyao Li, Wei Hu

Towards a critical evaluation of embedding-based entity alignment methods, we construct a new dataset with heterogeneous relations and attributes based on event-centric KGs.

Attribute Benchmarking +2

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

1 code implementation 13 Sep 2023 Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li

Never having seen an object and heard its sound simultaneously, can the model still accurately localize its visual position from the input audio?

CoLA Visual Localization

Deep Inception Generative Network for Cognitive Image Inpainting

no code implementations 1 Dec 2018 Qingguo Xiao, Guangyao Li, Qiaochuan Chen

Recent advances in deep learning have shown exciting promise in filling large holes and have led to a new direction for image inpainting.

Attribute Image Inpainting

WegFormer: Transformers for Weakly Supervised Semantic Segmentation

no code implementations 16 Mar 2022 Chunmeng Liu, Enze Xie, Wenjia Wang, Wenhai Wang, Guangyao Li, Ping Luo

Although convolutional neural networks (CNNs) have achieved remarkable progress in weakly supervised semantic segmentation (WSSS), the effective receptive field of CNN is insufficient to capture global context information, leading to sub-optimal results.

Segmentation Weakly supervised Semantic Segmentation +1

Heterogeneous Federated Knowledge Graph Embedding Learning and Unlearning

no code implementations 4 Feb 2023 Xiangrong Zhu, Guangyao Li, Wei Hu

To cope with the drift between local optimization and global convergence caused by data heterogeneity, we propose mutual knowledge distillation to transfer local knowledge to global, and absorb global knowledge back.

Federated Learning Knowledge Distillation +2

CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing

no code implementations 11 Oct 2023 Yaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang

Audio-visual video parsing is the task of categorizing a video at the segment level with weak labels, and predicting them as audible or visible events.
