Search Results for author: Ge Zhu

Found 9 papers, 9 papers with code

Transcription free filler word detection with Neural semi-CRFs

1 code implementation · 11 Mar 2023 · Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan

Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty.

Automatic Speech Recognition (ASR)

Sharp Eyes: A Salient Object Detector Working The Same Way as Human Visual Characteristics

1 code implementation · 18 Jan 2023 · Ge Zhu, Jinbao Li, Yahong Guo

In the OS branch, we first aggregate multi-level features to adaptively select complementary components, and then feed the saliency features with expanded boundary into the aggregated features to guide the network to obtain a complete prediction.

Semantic Segmentation

Music Source Separation with Generative Flow

1 code implementation · 19 Apr 2022 · Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art.

Music Source Separation

Filler Word Detection and Classification: A Dataset and Benchmark

1 code implementation · 28 Mar 2022 · Ge Zhu, Juan-Pablo Caceres, Justin Salamon

In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions.

Classification, Keyword Spotting

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

1 code implementation · 10 Feb 2022 · You Zhang, Ge Zhu, Zhiyao Duan

We further propose fusion strategies for direct inference and fine-tuning to predict the SASV score based on the framework.

Speaker Verification

A study of the robustness of raw waveform based speaker embeddings under mismatched conditions

1 code implementation · 8 Oct 2021 · Ge Zhu, Frank Cwitkowitz, Zhiyao Duan

In this paper, we conduct a cross-dataset study on parametric and non-parametric raw-waveform based speaker embeddings through speaker verification experiments.

Speaker Verification

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

2 code implementations · 26 Jul 2021 · Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability, while the new task DF presents general audio compression.

Audio Compression, Face Swapping

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

3 code implementations · 3 Apr 2021 · You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan

Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials.

Data Augmentation, Multi-Task Learning

Y-Vector: Multiscale Waveform Encoder for Speaker Embedding

1 code implementation · 24 Oct 2020 · Ge Zhu, Fei Jiang, Zhiyao Duan

State-of-the-art text-independent speaker verification systems typically use cepstral features or filter bank energies as speech features.

Text-Independent Speaker Verification
