Search Results for author: Ge Zhu

Found 9 papers, 9 papers with code

Transcription free filler word detection with Neural semi-CRFs

1 code implementation • 11 Mar 2023 • Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan

Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty.

Automatic Speech Recognition (ASR) +1

Sharp Eyes: A Salient Object Detector Working The Same Way as Human Visual Characteristics

1 code implementation • 18 Jan 2023 • Ge Zhu, Jinbao Li, Yahong Guo

In the OS branch, we first aggregate multi-level features to adaptively select complementary components, and then feed the saliency features with expanded boundaries into the aggregated features to guide the network to obtain a complete prediction.

Semantic Segmentation

Music Source Separation with Generative Flow

1 code implementation • 19 Apr 2022 • Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art.

Music Source Separation

Filler Word Detection and Classification: A Dataset and Benchmark

1 code implementation • 28 Mar 2022 • Ge Zhu, Juan-Pablo Caceres, Justin Salamon

In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions.

Ranked #1 on Sound Event Localization and Detection on PodcastFillers

Classification • Keyword Spotting +1

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

1 code implementation • 10 Feb 2022 • You Zhang, Ge Zhu, Zhiyao Duan

We further propose fusion strategies for direct inference and fine-tuning to predict the SASV score based on the framework.

Speaker Verification

A study of the robustness of raw waveform based speaker embeddings under mismatched conditions

1 code implementation • 8 Oct 2021 • Ge Zhu, Frank Cwitkowitz, Zhiyao Duan

In this paper, we conduct a cross-dataset study on parametric and non-parametric raw-waveform based speaker embeddings through speaker verification experiments.

Speaker Verification

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

2 code implementations • 26 Jul 2021 • Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability, while the new task DF presents general audio compression.

Audio Compression • Face Swapping +2