Search Results for author: Juan-Pablo Caceres

Found 4 papers, 3 papers with code

Transcription free filler word detection with Neural semi-CRFs

1 code implementation • 11 Mar 2023 • Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan

Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty.

Automatic Speech Recognition (ASR)

Filler Word Detection and Classification: A Dataset and Benchmark

1 code implementation • 28 Mar 2022 • Ge Zhu, Juan-Pablo Caceres, Justin Salamon

In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions.

Ranked #1 on Sound Event Localization and Detection on PodcastFillers

Classification, Keyword Spotting

Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet

1 code implementation • 5 Oct 2021 • Max Morrison, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Modifying the pitch and timing of an audio signal are fundamental audio editing operations with applications in speech manipulation, audio-visual synchronization, and singing voice editing and synthesis.

Audio-Visual Synchronization

Context-Aware Prosody Correction for Text-Based Speech Editing

no code implementations • 16 Feb 2021 • Max Morrison, Lucas Rencker, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Text-based speech editors expedite the process of editing speech recordings by permitting editing via intuitive cut, copy, and paste operations on a speech transcript.

Denoising
