Search Results for author: Bilei Zhu

Found 11 papers, 5 papers with code

Joint Music and Language Attention Models for Zero-shot Music Tagging

no code implementations • 16 Oct 2023 • Xingjian Du, Zhesong Yu, Jiaju Lin, Bilei Zhu, Qiuqiang Kong

However, previous music tagging research primarily focuses on closed-set music tagging tasks, which cannot be generalized to new tags.

Audio Tagging, Music Tagging

ByteCover3: Accurate Cover Song Identification on Short Queries

no code implementations • 21 Mar 2023 • Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma

Deep learning based methods have become a paradigm for cover song identification (CSI) in recent years, where the ByteCover systems have achieved state-of-the-art results on all the mainstream datasets of CSI.

Cover song identification, Retrieval

Graph Contrastive Learning with Implicit Augmentations

1 code implementation • 7 Nov 2022 • Huidong Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Ke Chen, Junbin Gao

Existing graph contrastive learning methods rely on augmentation techniques based on random perturbations (e.g., randomly adding or dropping edges and nodes).

Contrastive Learning, Graph Classification, +1

HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection

1 code implementation • 2 Feb 2022 • Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov

To combat these problems, we introduce HTS-AT: an audio transformer with a hierarchical structure to reduce the model size and training time.

Audio Classification, Event Detection, +3

ByteCover: Cover Song Identification via Multi-Loss Training

1 code implementation • 27 Oct 2020 • Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma

We present in this paper ByteCover, which is a new feature learning method for cover song identification (CSI).

Cover song identification

Rule-embedded network for audio-visual voice activity detection in live musical video streams

1 code implementation • 27 Oct 2020 • Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren

Detecting the anchor's voice in live musical streams is an important preprocessing step for music and speech signal processing.

Sound, Multimedia, Audio and Speech Processing

Contrastive Unsupervised Learning for Audio Fingerprinting

no code implementations • 26 Oct 2020 • Zhesong Yu, Xingjian Du, Bilei Zhu, Zejun Ma

The rise of video-sharing platforms has attracted more and more people to shoot videos and upload them to the Internet.

Contrastive Learning
