no code implementations • 16 Oct 2023 • Cheol Jun Cho, Abdelrahman Mohamed, Alan W Black, Gopala K. Anumanchipalli
Self-Supervised Learning (SSL) based models of speech have shown remarkable performance on a range of downstream tasks.
1 code implementation • 16 Oct 2023 • Cheol Jun Cho, Abdelrahman Mohamed, Shang-Wen Li, Alan W Black, Gopala K. Anumanchipalli
Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a new era of spoken language processing.
1 code implementation • 14 Sep 2023 • Gašper Beguš, Thomas Lu, Alan Zhou, Peter Wu, Gopala K. Anumanchipalli
This paper introduces CiwaGAN, a model of human spoken language acquisition that combines unsupervised articulatory modeling with an unsupervised model of information exchange through the auditory modality.
no code implementations • 12 Aug 2023 • Cheol Jun Cho, Edward F. Chang, Gopala K. Anumanchipalli
The proposed framework learns more cross-trial consistent representations than the baselines, and when visualized, the manifold reveals shared neural trajectories across trials.
1 code implementation • 5 Jul 2023 • Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W Black, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli
Finally, through a series of ablations, we show that the proposed MRI representation is more comprehensive than EMA and identify the most suitable MRI feature subset for articulatory synthesis.
1 code implementation • 14 Feb 2023 • Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli
To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space.
no code implementations • 29 Oct 2022 • Jiachen Lian, Alan W Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli
In this work, we propose a novel articulatory representation decomposition algorithm that takes advantage of guided factor analysis to derive articulatory-specific factors and factor scores.
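The core idea of such a decomposition is to express an articulatory feature matrix as the product of per-frame factor scores and factor loadings over articulatory channels. The sketch below illustrates that factors-times-scores structure with a plain truncated SVD on synthetic low-rank data; it is a minimal, unguided stand-in, not the guided factor analysis the paper proposes, and all array names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical EMA-like articulatory features: 200 frames x 12 channels,
# generated from 3 latent factors so the matrix is (noisily) low-rank.
k = 3
scores_true = rng.normal(size=(200, k))
factors_true = rng.normal(size=(k, 12))
X = scores_true @ factors_true + 0.01 * rng.normal(size=(200, 12))

# Truncated SVD: X ~= scores @ factors, keeping k factors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :k] * s[:k]   # per-frame factor scores
factors = Vt[:k]            # factor loadings over articulatory channels

# Relative reconstruction error from the k retained factors.
recon_err = np.linalg.norm(X - scores @ factors) / np.linalg.norm(X)
```

On data that really is (close to) rank-k, the k-factor reconstruction recovers nearly all of the signal; the guided variant additionally constrains which channels load onto which factor so the factors are articulator-specific.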
no code implementations • 27 Oct 2022 • Yisi Liu, Peter Wu, Alan W Black, Gopala K. Anumanchipalli
Estimation of fundamental frequency (F0) in voiced segments of speech signals, also known as pitch tracking, plays a crucial role in pitch synchronous speech analysis, speech synthesis, and speech manipulation.
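For context, the classic baseline for the F0-estimation task described above is the autocorrelation method: a voiced frame correlates strongly with itself at a lag of one pitch period, so the best lag within a plausible pitch range gives F0. The sketch below is a minimal illustration of that baseline on a synthetic tone, not the method of the paper.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=50.0, fmax=400.0):
    """Estimate F0 of a voiced frame via the autocorrelation method."""
    frame = frame - frame.mean()
    # Full autocorrelation; keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)   # smallest lag = highest admissible pitch
    lag_max = int(sr / fmin)   # largest lag = lowest admissible pitch
    best_lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return sr / best_lag

sr = 16000
t = np.arange(sr) / sr                 # one second at 16 kHz
tone = np.sin(2 * np.pi * 220.0 * t)   # synthetic "voiced" signal at 220 Hz
f0 = estimate_f0(tone[:1024], sr)      # analyze a single 64 ms frame
```

Real pitch trackers add voicing detection, frame-level smoothing, and sub-lag interpolation on top of this core idea, since the integer-lag estimate is quantized by the sampling rate.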
1 code implementation • 21 Oct 2022 • Cheol Jun Cho, Peter Wu, Abdelrahman Mohamed, Gopala K. Anumanchipalli
Recent self-supervised learning (SSL) models have proven to learn rich representations of speech, which can readily be utilized by diverse downstream tasks.
1 code implementation • 13 Sep 2022 • Peter Wu, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli
In the articulatory synthesis task, speech is synthesized from input features containing information about the physical behavior of the human vocal tract.
no code implementations • 3 Sep 2019 • Pengfei Sun, Gopala K. Anumanchipalli, Edward F. Chang
These results set a new state of the art in decoding text from brain signals and demonstrate the potential of Brain2Char as a high-performance communication BCI.