Search Results for author: Chutong Meng

Found 3 papers, 3 papers with code

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

1 code implementation · 8 Apr 2022 · Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao

The training set is translated by a strong machine translation system, and the test set is translated by humans.

Machine Translation · Translation

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

3 code implementations · 30 Mar 2023 · Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang

To address this data scarcity issue, we introduce WavCaps, the first large-scale weakly-labelled audio captioning dataset, comprising approximately 400k audio clips with paired captions.

 Ranked #1 on Zero-Shot Environment Sound Classification on ESC-50 (using extra training data)

Audio captioning · Event Detection · +6 more

RepCodec: A Speech Representation Codec for Speech Tokenization

1 code implementation · 31 Aug 2023 · Zhichao Huang, Chutong Meng, Tom Ko

To improve the performance of these discrete speech tokens, we present RepCodec, a novel speech representation codec for semantic speech tokenization.

Language Modelling · Quantization
