1 code implementation • 25 Sep 2023 • Di Liang, Nian Shao, Xiaofei Li
This work proposes a frame-wise online/streaming end-to-end neural diarization (FS-EEND) method that operates in a frame-in-frame-out fashion.
1 code implementation • 15 Sep 2023 • Nian Shao, Xian Li, Xiaofei Li
In this work, we study how to fine-tune pretrained models for sound event detection (SED).
Ranked #1 on Sound Event Detection on DESED (using extra training data)
2 code implementations • 7 Jun 2023 • Xian Li, Nian Shao, Xiaofei Li
To tackle both clip-level and frame-level tasks, this paper proposes the Audio Teacher-Student Transformer (ATST), with a clip-level version (ATST-Clip) and a frame-level version (ATST-Frame) that learn clip-level and frame-level representations, respectively.
Ranked #8 on Audio Classification on AudioSet
2 code implementations • 21 Oct 2021 • Nian Shao, Erfan Loweimi, Xiaofei Li
Sound event detection (SED), as a core module of acoustic environmental analysis, suffers from the problem of data deficiency.
Ranked #5 on Sound Event Detection on DESED