Automatic Phoneme Recognition

1 paper with code • 6 benchmarks • 6 datasets

Automatic Phoneme Recognition (APR) converts speech into a sequence of phonemes, the distinct units of sound that distinguish one word from another in a given language. It transcribes spoken words into their phonetic representations in real time, enabling detailed analysis of speech patterns, pronunciation, and linguistic nuances. The goal is to identify and transcribe phonemes accurately despite variations in accent, pronunciation, and speaking style, as well as background noise and other factors that affect speech quality. The technology is important for linguistic research, speech therapy, and language-learning applications, and for improving the performance of speech recognition systems.
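To make "accurately identify and transcribe phonemes" concrete, the sketch below computes a phoneme error rate (PER): the edit distance between a reference and a predicted phoneme sequence, divided by the reference length. The description above does not name this metric, so treat its use here as an assumption. The function names and the sample French transcription are hypothetical and are not taken from any listed benchmark or dataset.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref phoneme missing from hyp)
                curr[j - 1] + 1,     # insertion (extra phoneme in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Illustrative only: "bonjour" as an IPA phoneme list, with a hypothetical
# model output that denasalizes one vowel and drops the final consonant.
reference = ["b", "ɔ̃", "ʒ", "u", "ʁ"]
hypothesis = ["b", "ɔ", "ʒ", "u"]
per = phoneme_error_rate(reference, hypothesis)  # 2 errors / 5 phonemes
```

Phonemes are represented as list items rather than characters of a string, so that multi-character symbols such as a nasalized vowel count as a single unit.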

Most implemented papers

Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors

jhauret/vibravox 16 Jul 2024

Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) containing audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone.