Search Results for author: Peter Wu

Found 18 papers, 15 papers with code

CiwaGAN: Articulatory information exchange

1 code implementation · 14 Sep 2023 · Gašper Beguš, Thomas Lu, Alan Zhou, Peter Wu, Gopala K. Anumanchipalli

This paper introduces CiwaGAN, a model of human spoken language acquisition that combines unsupervised articulatory modeling with an unsupervised model of information exchange through the auditory modality.

Language Acquisition

Deep Speech Synthesis from MRI-Based Articulatory Representations

1 code implementation · 5 Jul 2023 · Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W Black, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

Finally, through a series of ablations, we show that the proposed MRI representation is more comprehensive than EMA and identify the most suitable MRI feature subset for articulatory synthesis.

Computational Efficiency · Denoising · +1

Speaker-Independent Acoustic-to-Articulatory Speech Inversion

1 code implementation · 14 Feb 2023 · Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli

To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space.

Resynthesis

Articulation GAN: Unsupervised modeling of articulatory learning

1 code implementation · 27 Oct 2022 · Gašper Beguš, Alan Zhou, Peter Wu, Gopala K Anumanchipalli

Articulatory analysis suggests that the network learns to control articulators in a similar manner to humans during speech production.

Generative Adversarial Network · Speech Synthesis

A Fast and Accurate Pitch Estimation Algorithm Based on the Pseudo Wigner-Ville Distribution

no code implementations · 27 Oct 2022 · Yisi Liu, Peter Wu, Alan W Black, Gopala K. Anumanchipalli

Estimation of fundamental frequency (F0) in voiced segments of speech signals, also known as pitch tracking, plays a crucial role in pitch synchronous speech analysis, speech synthesis, and speech manipulation.

Speech Synthesis
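The paper's estimator is built on the pseudo Wigner-Ville distribution, which is not reproduced here. As a generic, hypothetical illustration of what pitch tracking computes — recovering F0 from the periodicity of a voiced frame — here is a minimal autocorrelation-based sketch (not the paper's method; the function name and search range are made up):

```python
import numpy as np

def estimate_f0(frame, sr, fmin=50.0, fmax=500.0):
    """Toy F0 estimate for one voiced frame via autocorrelation.

    Illustrative only; the paper instead uses the pseudo
    Wigner-Ville distribution for higher time-frequency resolution.
    """
    frame = frame - frame.mean()
    # Autocorrelation at non-negative lags
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)  # smallest lag (highest admissible pitch)
    hi = int(sr / fmin)  # largest lag (lowest admissible pitch)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

# A 200 Hz sine sampled at 16 kHz should yield F0 near 200 Hz
sr = 16000
t = np.arange(1024) / sr
f0 = estimate_f0(np.sin(2 * np.pi * 200.0 * t), sr)
```

The autocorrelation peak picks the dominant lag, whose reciprocal (times the sample rate) is the pitch; PWVD-based methods aim for the same quantity with better resolution on short, nonstationary segments.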

Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech

1 code implementation · 21 Oct 2022 · Cheol Jun Cho, Peter Wu, Abdelrahman Mohamed, Gopala K. Anumanchipalli

Recent self-supervised learning (SSL) models have proven to learn rich representations of speech, which can readily be utilized by diverse downstream tasks.

Self-Supervised Learning

Deep Speech Synthesis from Articulatory Representations

1 code implementation · 13 Sep 2022 · Peter Wu, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli

In the articulatory synthesis task, speech is synthesized from input features containing information about the physical behavior of the human vocal tract.

Speech Synthesis
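The paper trains a deep model for this mapping; as a toy illustration of the general idea — turning a physical description of the vocal tract into audio — here is a classical source-filter sketch that excites formant resonators with a glottal pulse train. All formant frequencies and bandwidths below are made-up illustrative values, not parameters from the paper:

```python
import numpy as np

def resonator(x, freq, bw, sr):
    """Second-order IIR resonator approximating one formant."""
    r = np.exp(-np.pi * bw / sr)       # pole radius from bandwidth
    theta = 2 * np.pi * freq / sr      # pole angle from center frequency
    a1, a2 = 2 * r * np.cos(theta), -r * r
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] + a1 * y[n - 1] + a2 * y[n - 2]
    return y

sr, f0 = 16000, 120
n = np.arange(sr // 10)                           # 100 ms of samples
source = (n % (sr // f0) == 0).astype(float)      # impulse train at F0
speech = source
for f, b in [(800, 80), (1200, 90)]:              # illustrative formants
    speech = resonator(speech, f, b, sr)
```

Deep articulatory synthesis replaces this hand-built filter chain with a learned mapping from measured vocal-tract features (e.g. EMA trajectories) to the waveform.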

PACS: A Dataset for Physical Audiovisual CommonSense Reasoning

1 code implementation · 21 Mar 2022 · Samuel Yu, Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Our paper takes a step towards real-world physical commonsense reasoning by contributing PACS: the first audiovisual benchmark annotated for physical commonsense attributes.

Multimodal Reasoning · Physical Commonsense Reasoning

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

1 code implementation · 2 Nov 2021 · Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W Black

We demonstrate the effectiveness of our approach in language family classification, speech recognition, and speech synthesis tasks.

Cross-Lingual Transfer · Speech Recognition · +2

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

2 code implementations · 15 Jul 2021 · Paul Pu Liang, Yiwei Lyu, Xiang Fan, Zetian Wu, Yun Cheng, Jason Wu, Leslie Chen, Peter Wu, Michelle A. Lee, Yuke Zhu, Ruslan Salakhutdinov, Louis-Philippe Morency

In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiBench, a systematic and unified large-scale benchmark spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas.

Representation Learning

Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment

1 code implementation · 4 Dec 2020 · Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov

In this work, we propose algorithms for cross-modal generalization: a learning paradigm to train a model that can (1) quickly perform new tasks in a target modality (i.e., meta-learning) and (2) do so while being trained on a different source modality.

Meta-Learning

Automatically Identifying Language Family from Acoustic Examples in Low Resource Scenarios

1 code implementation · 1 Dec 2020 · Peter Wu, Yifan Zhong, Alan W Black

Existing multilingual speech NLP works focus on a relatively small subset of languages, and thus current linguistic understanding of languages predominantly stems from classical approaches.

Data Augmentation

LEAF: A Benchmark for Federated Settings

7 code implementations · 3 Dec 2018 · Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, Ameet Talwalkar

Modern federated networks, such as those composed of wearable devices, mobile phones, or autonomous vehicles, generate massive amounts of data each day.

Autonomous Vehicles · Benchmarking · +3

Machine Learning for Exam Triage

1 code implementation · 30 Apr 2018 · Xinyu Guan, Jessica Lee, Peter Wu, Yue Wu

In this project, we extend the state-of-the-art CheXNet (Rajpurkar et al., 2017) by making use of the additional non-image features in the dataset.

BIG-bench Machine Learning
