Search Results for author: Xiangdong Wang

Found 15 papers, 8 papers with code

Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering

no code implementations • NAACL 2022 • Jianguo Mao, Wenbin Jiang, Xiangdong Wang, Zhifan Feng, Yajuan Lyu, Hong Liu, Yong Zhu

Then, it performs multistep reasoning between the representations of the question and the video to reach a better answer decision, and dynamically integrates the reasoning results.

Question Answering Video Question Answering +1

Hierarchical Representation-based Dynamic Reasoning Network for Biomedical Question Answering

no code implementations • COLING 2022 • Jianguo Mao, Jiyuan Zhang, Zengfeng Zeng, Weihua Peng, Wenbin Jiang, Xiangdong Wang, Hong Liu, Yajuan Lyu

It then performs dynamic reasoning based on the hierarchical representations of evidence to solve complex biomedical problems.

Question Answering

Semi-supervised Sound Event Detection with Local and Global Consistency Regularization

no code implementations • 15 Sep 2023 • Yiming Li, Xiangdong Wang, Hong Liu, Rui Tao, Long Yan, Kazushige Ouchi

Then, the local consistency is adopted to encourage the model to leverage local features for frame-level predictions, and the global consistency is applied to force features to align with global prototypes through a specially designed contrastive loss.

Event Detection Sound Event Detection

Audio-free Prompt Tuning for Language-Audio Models

no code implementations • 15 Sep 2023 • Yiming Li, Xiangdong Wang, Hong Liu

Contrastive Language-Audio Pretraining (CLAP) is pre-trained to associate audio features with human language, making it a natural zero-shot classifier to recognize unseen sound categories.

Audio Generation with Multiple Conditional Diffusion Model

no code implementations • 23 Aug 2023 • Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

To address this issue, we propose a novel model that enhances the controllability of existing pre-trained text-to-audio models by incorporating additional conditions including content (timestamp) and style (pitch contour and energy contour) as supplements to the text.

Audio Generation Language Modelling +1

Furnishing Sound Event Detection with Language Model Abilities

no code implementations • 22 Aug 2023 • Hualei Wang, Jianguo Mao, Zhifang Guo, Jiarui Wan, Hong Liu, Xiangdong Wang

Recently, the ability of language models (LMs) has attracted increasing attention in visual cross-modality.

Event Detection Language Modelling +1

Compact Twice Fusion Network for Edge Detection

1 code implementation • 11 Jul 2023 • Yachuan Li, Zongmin Li, Xavier Soria P., Chaozhi Yang, Qian Xiao, Yun Bai, Hua Li, Xiangdong Wang

In this work, we propose a Compact Twice Fusion Network (CTFN) to fully integrate multi-scale features while maintaining the compactness of the model.

Edge Detection

A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4

1 code implementation • 18 Oct 2022 • Yiming Li, Zhifang Guo, Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi

For the frame-wise model, the ICT-TOSHIBA system of DCASE 2021 Task 4 is used.

Event Detection Metric Learning +1

Couple Learning for semi-supervised sound event detection

2 code implementations • 12 Oct 2021 • Rui Tao, Long Yan, Kazushige Ouchi, Xiangdong Wang

The recently proposed Mean Teacher method, which exploits large-scale unlabeled data in a self-ensembling manner, has achieved state-of-the-art results in several semi-supervised learning benchmarks.

Event Detection Sound Event Detection

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

1 code implementation • 5 Oct 2021 • Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi

A critical issue with the frame-based model is that it pursues the best frame-level prediction rather than the best event-level prediction.

Audio Tagging Boundary Detection +5

An End-to-end Approach for Lexical Stress Detection based on Transformer

no code implementations • 6 Nov 2019 • Yong Ruan, Xiangdong Wang, Hong Liu, Zhigang Ou, Yun Gao, Jianfeng Cheng, Yueliang Qian

For this, we train a Transformer model using audio feature sequences and their corresponding phoneme sequences with lexical stress marks.

General Classification

Guided Learning Convolution System for DCASE 2019 Task 4

1 code implementation • 11 Sep 2019 • Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this paper, we describe in detail the system we submitted to DCASE 2019 Task 4: sound event detection (SED) in domestic environments.

Event Detection Sound Event Detection

Guided learning for weakly-labeled semi-supervised sound event detection

1 code implementation • 6 Jun 2019 • Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

Instead of designing a single model by considering a trade-off between the two sub-targets, we design a teacher model aiming at audio tagging to guide a student model aiming at boundary detection to learn using the unlabeled data.

Audio Tagging Boundary Detection +3

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection

1 code implementation • 24 May 2019 • Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this paper, a special decision surface for the weakly-supervised sound event detection (SED) and a disentangled feature (DF) for the multi-label problem in polyphonic SED are proposed.

Event Detection Multi-Label Classification +2

Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor

1 code implementation • The 13th Asian Conference on Computer Vision 2016 • Haomiao Ni, Hong Liu, Xiangdong Wang, Yueliang Qian

This paper proposes a novel human action recognition method using the decision-level fusion of both skeleton and depth sequences.

Action Recognition Dynamic Time Warping +1
