Search Results for author: Guangyao Li

Found 15 papers, 10 papers with code

Scene Text Detection with Supervised Pyramid Context Network

2 code implementations • 21 Nov 2018 • Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li

We propose a supervised pyramid context network (SPCNET) to precisely locate text regions while suppressing false positives.

Ranked #2 on Scene Text Detection on ICDAR 2013

Instance Segmentation Scene Text Detection +2

Learning to Answer Questions in Dynamic Audio-Visual Scenarios

1 code implementation • CVPR 2022 • Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, Di Hu

In this paper, we focus on the Audio-Visual Question Answering (AVQA) task, which aims to answer questions regarding different visual objects, sounds, and their associations in videos.

Ranked #5 on Audio-visual Question Answering on MUSIC-AVQA

audio-visual learning Audio-visual Question Answering +4

Self-supervised Audiovisual Representation Learning for Remote Sensing Data

1 code implementation • 2 Aug 2021 • Konrad Heidler, Lichao Mou, Di Hu, Pu Jin, Guangyao Li, Chuang Gan, Ji-Rong Wen, Xiao Xiang Zhu

By fine-tuning the models on a number of commonly used remote sensing datasets, we show that our approach outperforms existing pre-training strategies for remote sensing imagery.

Ranked #2 on Cross-Modal Retrieval on SoundingEarth

Cross-Modal Retrieval Representation Learning +1

Multi-Scale Attention for Audio Question Answering

1 code implementation • 29 May 2023 • Guangyao Li, Yixin Xu, Di Hu

Audio question answering (AQA), a widely used proxy task for exploring scene understanding, has received growing attention.

Audio Question Answering Question Answering +2

I Know What You Do Not Know: Knowledge Graph Embedding via Co-distillation Learning

1 code implementation • 21 Aug 2022 • Yang Liu, Zequn Sun, Guangyao Li, Wei Hu

To this end, we propose CoLE, a Co-distillation Learning method for KG Embedding that exploits the complementarity of graph structures and text information.

Knowledge Graph Embedding Language Modelling

Rule-Guided Graph Neural Networks for Recommender Systems

1 code implementation • 9 Sep 2020 • Xinze Lyu, Guangyao Li, Jiacheng Huang, Wei Hu

However, existing work that incorporates KGs cannot capture the explicit long-range semantics between users and items while also considering the various connectivity between items.

Collaborative Filtering Knowledge Graphs +1

Progressive Spatio-temporal Perception for Audio-Visual Question Answering

1 code implementation • 10 Aug 2023 • Guangyao Li, Wenxuan Hou, Di Hu

Such naturally multi-modal videos are composed of rich and complex dynamic audio-visual components, most of which may be unrelated to the given questions or may even interfere with answering the content of interest.

Ranked #2 on Audio-Visual Question Answering (AVQA) on AVQA

Audio-visual Question Answering Audio-Visual Question Answering (AVQA) +2

EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs

1 code implementation • 5 Nov 2022 • Xiaobin Tian, Zequn Sun, Guangyao Li, Wei Hu

Towards a critical evaluation of embedding-based entity alignment methods, we construct a new dataset with heterogeneous relations and attributes based on event-centric KGs.

Attribute Benchmarking +2

MECPformer: Multi-estimations Complementary Patch with CNN-Transformers for Weakly Supervised Semantic Segmentation

1 code implementation • 19 Mar 2023 • Chunmeng Liu, Guangyao Li, Yao Shen, Ruiqi Wang

Given a class, the initial seeds generated based on the transformer may invade regions belonging to other classes.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

1 code implementation • 13 Sep 2023 • Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li

Never having seen an object and heard its sound simultaneously, can the model still accurately localize its visual position from the input audio?

CoLA Visual Localization

Real-world plant species identification based on deep convolutional neural networks and visual attention

no code implementations • 11 Apr 2018 • Qingguo Xiao, Guangyao Li, Li Xie, Qiaochuan Chen

We propose a novel framework and an effective data augmentation method for deep learning in this paper.

Data Augmentation

Deep Inception Generative Network for Cognitive Image Inpainting

no code implementations • 1 Dec 2018 • Qingguo Xiao, Guangyao Li, Qiaochuan Chen

Recent advances in deep learning have shown exciting promise in filling large holes and have led to a new direction for image inpainting.

Attribute Image Inpainting

WegFormer: Transformers for Weakly Supervised Semantic Segmentation

no code implementations • 16 Mar 2022 • Chunmeng Liu, Enze Xie, Wenjia Wang, Wenhai Wang, Guangyao Li, Ping Luo

Although convolutional neural networks (CNNs) have achieved remarkable progress in weakly supervised semantic segmentation (WSSS), the effective receptive field of CNN is insufficient to capture global context information, leading to sub-optimal results.

Segmentation Weakly supervised Semantic Segmentation +1

Heterogeneous Federated Knowledge Graph Embedding Learning and Unlearning

no code implementations • 4 Feb 2023 • Xiangrong Zhu, Guangyao Li, Wei Hu

To cope with the drift between local optimization and global convergence caused by data heterogeneity, we propose mutual knowledge distillation to transfer local knowledge to global, and absorb global knowledge back.

Federated Learning Knowledge Distillation +2

CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing

no code implementations • 11 Oct 2023 • Yaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang

Audio-visual video parsing is the task of categorizing a video at the segment level with weak labels, and predicting them as audible or visible events.
