Search Results for author: William Chen

Found 36 papers, 10 papers with code

The UCF Systems for the LoResMT 2021 Machine Translation Shared Task

no code implementations · MTSummit 2021 · William Chen, Brett Fazio

We present the University of Central Florida systems for the LoResMT 2021 Shared Task, participating in the English-Irish and English-Marathi translation pairs.

Tasks: Machine Translation, Transfer Learning (+1 more)

Morphologically-Guided Segmentation For Translation of Agglutinative Low-Resource Languages

1 code implementation · MTSummit 2021 · William Chen, Brett Fazio

Neural Machine Translation (NMT) for Low Resource Languages (LRL) is often limited by the lack of available training data, making it necessary to explore additional techniques to improve translation quality.

Tasks: Machine Translation, NMT (+2 more)

ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration

no code implementations · 14 Sep 2024 · Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe

We introduce ESPnet-EZ, an extension of the open-source speech processing toolkit ESPnet, aimed at quick and easy development of speech models.

CMU's IWSLT 2024 Simultaneous Speech Translation System

no code implementations · 14 Aug 2024 · Xi Xu, Siqi Ouyang, Brian Yan, Patrick Fernandes, William Chen, Lei Li, Graham Neubig, Shinji Watanabe

This paper describes CMU's submission to the IWSLT 2024 Simultaneous Speech Translation (SST) task for translating English speech to German text in a streaming manner.

Tasks: Decoder, Translation

Robotic Control via Embodied Chain-of-Thought Reasoning

no code implementations · 11 Jul 2024 · Michał Zawalski, William Chen, Karl Pertsch, Oier Mees, Chelsea Finn, Sergey Levine

Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models as the backbone of learned robot policies can substantially improve their robustness and generalization ability.

Nollywood: Let's Go to the Movies!

no code implementations · 2 Jul 2024 · John E. Ortega, Ibrahim Said Ahmad, William Chen

Nollywood, based on the idea of Bollywood from India, is a series of outstanding movies that originate from Nigeria.

Towards Robust Speech Representation Learning for Thousands of Languages

no code implementations · 30 Jun 2024 · William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe

We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages, extending the language coverage of SSL models 4-fold.

Tasks: Representation Learning, Self-Supervised Learning (+1 more)

On the Evaluation of Speech Foundation Models for Spoken Language Understanding

no code implementations · 14 Jun 2024 · Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

To answer this, we perform an extensive evaluation of multiple supervised and self-supervised SFMs using several evaluation protocols: (i) frozen SFMs with a lightweight prediction head, (ii) frozen SFMs with a complex prediction head, and (iii) fine-tuned SFMs with a lightweight prediction head.

Tasks: Benchmarking, Prediction (+3 more)

On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

no code implementations · 13 Jun 2024 · Jinchuan Tian, Yifan Peng, William Chen, Kwanghee Choi, Karen Livescu, Shinji Watanabe

The Open Whisper-style Speech Model (OWSM) series was introduced to achieve full transparency in building advanced speech-to-text (S2T) foundation models.

Tasks: Language Modeling (+1 more)

ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

no code implementations · 12 Jun 2024 · Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-Yi Lee, Shinji Watanabe

This paper presents ML-SUPERB 2.0, which is a new benchmark for evaluating pre-trained SSL and supervised speech models across downstream models, fine-tuning setups, and efficient model adaptation approaches.

Tasks: Automatic Speech Recognition (ASR) (+4 more)

YODAS: Youtube-Oriented Dataset for Audio and Speech

no code implementations · 2 Jun 2024 · Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe

In this study, we introduce YODAS (YouTube-Oriented Dataset for Audio and Speech), a large-scale, multilingual dataset comprising currently over 500k hours of speech data in more than 100 languages, sourced from both labeled and unlabeled YouTube speech datasets.

Tasks: Self-Supervised Learning, Speech Recognition (+1 more)

Vision-Language Models Provide Promptable Representations for Reinforcement Learning

no code implementations · 5 Feb 2024 · William Chen, Oier Mees, Aviral Kumar, Sergey Levine

We find that our policies trained on embeddings from off-the-shelf, general-purpose VLMs outperform equivalent policies trained on generic, non-promptable image embeddings.

Tasks: Common Sense Reasoning, Instruction Following (+6 more)

AugSumm: towards generalizable speech summarization using synthetic labels from large language model

1 code implementation · 10 Jan 2024 · Jee-weon Jung, Roshan Sharma, William Chen, Bhiksha Raj, Shinji Watanabe

We tackle this challenge by proposing AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries for training and evaluation.

Tasks: Language Modeling (+2 more)

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

no code implementations · 9 Oct 2023 · Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe

The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework, emphasizing self-supervised models in multilingual speech recognition and language identification.

Tasks: Language Identification, Speech Recognition (+1 more)

Evaluating Self-Supervised Speech Representations for Indigenous American Languages

no code implementations · 5 Oct 2023 · Chih-Chen Chen, William Chen, Rodolfo Zevallos, John E. Ortega

The application of self-supervision to speech representation learning has garnered significant interest in recent years, due to its scalability to large amounts of unlabeled data.

Tasks: Representation Learning, Speech Representation Learning

Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning

no code implementations · 26 Sep 2023 · William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe

We show that further efficiency can be achieved with a vanilla HuBERT Base model, which can maintain 94% of XLS-R's performance with only 3% of the data, 4 GPUs, and limited trials.

Tasks: Denoising, Self-Supervised Learning

A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning

2 code implementations · 19 May 2023 · Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney

Our system achieves state-of-the-art speaker-level detection accuracy (97.3%), and a relative WER reduction of 11% for moderate Aphasia patients.

Tasks: Multi-Task Learning, Speech Recognition (+1 more)

A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks

2 code implementations · 18 May 2023 · Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe

Conformer, a convolution-augmented Transformer variant, has become the de facto encoder architecture for speech processing due to its superior performance in various tasks, including automatic speech recognition (ASR), speech translation (ST) and spoken language understanding (SLU).

Tasks: Automatic Speech Recognition (ASR) (+2 more)

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

no code implementations · 18 May 2023 · Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe

Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to benchmark the performance of Self-Supervised Learning (SSL) models on various speech processing tasks.

Tasks: Automatic Speech Recognition, Language Identification (+3 more)

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

1 code implementation · 24 Feb 2023 · William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe

In this paper, we introduce our work on improving performance on FLEURS, a 102-language open ASR benchmark, by conditioning the entire model on language identity (LID).

Tasks: Automatic Speech Recognition (ASR) (+1 more)

LaMPP: Language Models as Probabilistic Priors for Perception and Action

1 code implementation · 3 Feb 2023 · Belinda Z. Li, William Chen, Pratyusha Sharma, Jacob Andreas

Language models trained on large text corpora encode rich distributional information about real-world environments and action sequences.

Tasks: Activity Recognition, Decision Making (+2 more)

Benchmarking Azerbaijani Neural Machine Translation

no code implementations · 29 Jul 2022 · Chih-Chen Chen, William Chen

Little research has been done on Neural Machine Translation (NMT) for Azerbaijani.

Tasks: Benchmarking, Domain Generalization (+4 more)

Genetic Algorithms For Extractive Summarization

no code implementations · 5 May 2021 · William Chen, Kensal Ramos, Kalyan Naidu Mullaguri, Annie S. Wu

Most current work in NLP utilizes deep learning, which requires a lot of training data and computational power.

Tasks: Deep Learning, Extractive Summarization (+1 more)

The Kronecker-Weyl equidistribution theorem and geodesics in 3-manifolds

no code implementations · 17 Dec 2020 · Jozsef Beck, William Chen

Given any rectangular polyhedron 3-manifold $P$ tiled with unit cubes, we find infinitely many explicit directions related to cubic algebraic numbers such that all half-infinite geodesics in these directions are uniformly distributed in $P$.

Subjects: Number Theory (MSC 11K38, 37E35)

Generalized Method-of-Moments for Rank Aggregation

no code implementations · NeurIPS 2013 · Hossein Azari Soufiani, William Chen, David C. Parkes, Lirong Xia

In this paper we propose a class of efficient Generalized Method-of-Moments(GMM) algorithms for computing parameters of the Plackett-Luce model, where the data consists of full rankings over alternatives.
