no code implementations • 9 Sep 2024 • Junkun Chen, Jilin Mei, Liang Chen, Fangzhou Zhao, Yu Hu
The limited training samples available for object detectors commonly result in low accuracy in out-of-distribution (OOD) object detection.
no code implementations • 12 Jun 2024 • Peidong Wang, Jian Xue, Jinyu Li, Junkun Chen, Aswin Shanmugam Subramanian
Language-agnostic many-to-one end-to-end speech translation models can convert audio signals from different source languages into text in a target language.
no code implementations • 23 Oct 2023 • Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li, Yashesh Gaur
The growing need for instant spoken language transcription and translation is driven by increased global communication and cross-lingual interactions.
Automatic Speech Recognition (ASR) +4
no code implementations • 6 Oct 2023 • Junkun Chen, Jian Xue, Peidong Wang, Jing Pan, Jinyu Li
Simultaneous Speech-to-Text translation serves a critical role in real-time cross-lingual communication.
1 code implementation • 14 Sep 2023 • Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka
End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges, such as speaker diarization (SD) without accurate word timestamps and the handling of overlapping speech in a streaming fashion.
no code implementations • 7 Jul 2023 • Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Jinyu Li, Yashesh Gaur
In real-world applications, users often require both translations and transcriptions of speech to enhance their comprehension, particularly in streaming scenarios where incremental generation is necessary.
Automatic Speech Recognition (ASR) +2
2 code implementations • 7 Nov 2022 • Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu
In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing.
no code implementations • 10 Jun 2022 • Yuanyi Zhong, Haoran Tang, Junkun Chen, Jian Peng, Yu-Xiong Wang
Our insight has implications in improving the downstream robustness of supervised learning.
2 code implementations • NAACL (ACL) 2022 • Hui Zhang, Tian Yuan, Junkun Chen, Xintong Li, Renjie Zheng, Yuxin Huang, Xiaojie Chen, Enlei Gong, Zeyu Chen, Xiaoguang Hu, Dianhai Yu, Yanjun Ma, Liang Huang
PaddleSpeech is an open-source all-in-one speech toolkit.
Automatic Speech Recognition (ASR) • Environmental Sound Classification +9
no code implementations • 27 Apr 2022 • Guangxu Xun, Mingbo Ma, Yuchen Bian, Xingyu Cai, Jiaji Huang, Renjie Zheng, Junkun Chen, Jiahong Yuan, Kenneth Church, Liang Huang
In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency.
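The wait-k policy mentioned above has a simple read schedule: wait for k source tokens, then alternate reading one and writing one, so the t-th target token is emitted only after g(t) = min(k + t - 1, |source|) source tokens have been read. A minimal sketch (the function name and example lengths here are illustrative, not from any particular implementation):

```python
def wait_k_schedule(src_len, tgt_len, k):
    """For each target position t (1-indexed), return how many source
    tokens must have been read before emitting it under wait-k:
    g(t) = min(k + t - 1, src_len)."""
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]

# With k=3, translation starts after 3 source tokens, then reading and
# writing alternate one-for-one until the source is exhausted.
print(wait_k_schedule(src_len=8, tgt_len=8, k=3))
# -> [3, 4, 5, 6, 7, 8, 8, 8]
```

Smaller k lowers latency but gives the decoder less source context per target token, which is exactly the quality/latency trade-off the policy balances.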
2 code implementations • 18 Mar 2022 • He Bai, Renjie Zheng, Junkun Chen, Xintong Li, Mingbo Ma, Liang Huang
Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation.
1 code implementation • 16 Feb 2022 • Zhaocheng Zhu, Chence Shi, Zuobai Zhang, Shengchao Liu, Minghao Xu, Xinyu Yuan, Yangtian Zhang, Junkun Chen, Huiyu Cai, Jiarui Lu, Chang Ma, Runcheng Liu, Louis-Pascal Xhonneux, Meng Qu, Jian Tang
However, the lack of domain knowledge (e.g., which tasks to work on), standard benchmarks, and data preprocessing pipelines are the main obstacles for machine learning researchers entering this domain.
no code implementations • Findings (ACL) 2021 • Junkun Chen, Mingbo Ma, Renjie Zheng, Liang Huang
Simultaneous speech-to-text translation is widely useful in many scenarios.
no code implementations • 10 Feb 2021 • Renjie Zheng, Junkun Chen, Mingbo Ma, Liang Huang
Recently, representation learning for text and speech has successfully improved many language related tasks.
no code implementations • 22 Oct 2020 • Junkun Chen, Mingbo Ma, Renjie Zheng, Liang Huang
End-to-end Speech-to-text Translation (E2E-ST), which directly translates source language speech to target language text, is widely useful in practice, but traditional cascaded approaches (ASR+MT) often suffer from error propagation in the pipeline.
Automatic Speech Recognition (ASR) +4
no code implementations • EMNLP 2021 • Junkun Chen, Renjie Zheng, Atsuhito Kita, Mingbo Ma, Liang Huang
Simultaneous translation is vastly different from full-sentence translation, in the sense that it starts translating before the source sentence ends, with only a few words' delay.
2 code implementations • ICLR 2021 • Meng Qu, Junkun Chen, Louis-Pascal Xhonneux, Yoshua Bengio, Jian Tang
Then in the E-step, we select a set of high-quality rules from all generated rules with both the rule generator and reasoning predictor via posterior inference; and in the M-step, the rule generator is updated with the rules selected in the E-step.
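The E-step/M-step loop described above can be caricatured in a few lines: rank candidate rules by an unnormalized posterior combining the generator's prior with the predictor's score, keep the top subset, then move the generator toward the kept rules. This is only a hedged sketch; the rule names, scores, and the additive update are illustrative stand-ins, not the paper's actual parameterization:

```python
def e_step(rules, generator_logprob, predictor_score, n_select):
    """E-step sketch: score each rule by generator prior + predictor
    likelihood (an unnormalized log-posterior) and keep the top subset."""
    posterior = {r: generator_logprob[r] + predictor_score[r] for r in rules}
    return sorted(rules, key=lambda r: -posterior[r])[:n_select]

def m_step(selected_rules, generator_logprob, lr=0.5):
    """M-step sketch: nudge the generator toward the selected rules
    (a stand-in for maximizing their log-likelihood under the generator)."""
    for r in selected_rules:
        generator_logprob[r] += lr
    return generator_logprob

# Illustrative candidate rules with made-up scores.
rules = ["r1", "r2", "r3"]
gen = {"r1": -1.0, "r2": -2.0, "r3": -0.5}
pred = {"r1": 2.0, "r2": 0.1, "r3": 0.2}
selected = e_step(rules, gen, pred, n_select=2)   # -> ["r1", "r3"]
gen = m_step(selected, gen)
```

Alternating these two steps lets the rule generator and the reasoning predictor improve each other, which is the core of the EM framing.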
no code implementations • 25 Jul 2019 • Lin Zehui, Pengfei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang
Various dropout methods have been designed for the fully-connected, convolutional, and recurrent layers of neural networks, and have been shown to be effective in avoiding overfitting.
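As a reminder of what such layer-specific variants look like, here is a hedged NumPy sketch of two of them: element-wise dropout for fully-connected activations and channel-wise (spatial) dropout for convolutional feature maps, both with inverted scaling. The shapes and drop rate are illustrative, not tied to this paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p):
    """Standard element-wise dropout for fully-connected activations,
    with inverted scaling so expected values match at test time."""
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def spatial_dropout(x, p):
    """Channel-wise dropout for conv feature maps (shape: [C, H, W]):
    zero entire channels instead of individual units, preserving the
    spatial correlation structure within surviving channels."""
    mask = rng.random((x.shape[0], 1, 1)) >= p
    return x * mask / (1.0 - p)

fc = dropout(np.ones((4, 8)), p=0.5)            # units dropped independently
conv = spatial_dropout(np.ones((3, 5, 5)), p=0.5)  # whole channels dropped
```

The key difference is the granularity of the mask: per-unit for dense layers versus per-channel for convolutional ones, since neighboring pixels in a feature map are strongly correlated.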
2 code implementations • ICCV 2019 • Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang
We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context.
no code implementations • 23 Aug 2018 • Junkun Chen, Kaiyu Chen, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
Designing shared neural architecture plays an important role in multi-task learning.
no code implementations • 22 Apr 2018 • Renjie Zheng, Junkun Chen, Xipeng Qiu
More specifically, all tasks share the same sentence representation, and each task can select task-specific information from the shared sentence representation with an attention mechanism.
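The shared-representation-plus-attention idea can be sketched as follows, where each task owns a query vector that pools task-relevant information from the shared sentence states. All dimensions and names are illustrative, not the paper's exact architecture:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def task_attention(shared_states, task_query):
    """Attend over the shared sentence representation ([seq_len, dim])
    with a task-specific query vector, pooling the task-relevant
    information into a single summary vector."""
    scores = shared_states @ task_query   # [seq_len] relevance per token
    weights = softmax(scores)             # attention distribution
    return weights @ shared_states, weights

rng = np.random.default_rng(0)
shared = rng.standard_normal((6, 4))   # shared states: 6 tokens, dim 4
q_task = rng.standard_normal(4)        # hypothetical task-specific query
summary, attn = task_attention(shared, q_task)
```

Each task learns only its own small query vector, while the sentence encoder producing `shared_states` is trained jointly across all tasks.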
no code implementations • 25 Feb 2018 • Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang
Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and to generate the parameters of the task-specific semantic composition models.
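A meta-network that generates the parameters of a task-specific model is essentially a hypernetwork. A minimal sketch, assuming a linear composition layer whose weight matrix is produced from a task embedding (all shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

class MetaNetwork:
    """Hypernetwork sketch: a shared meta-network maps a task embedding
    to the weight matrix of a task-specific composition layer, so the
    per-task parameters are generated rather than stored."""
    def __init__(self, task_dim, in_dim, out_dim):
        # Shared meta-parameters, trained across all tasks.
        self.proj = rng.standard_normal((task_dim, in_dim * out_dim)) * 0.1
        self.in_dim, self.out_dim = in_dim, out_dim

    def generate(self, task_emb):
        # Map the task embedding to a full weight matrix for that task.
        return (task_emb @ self.proj).reshape(self.in_dim, self.out_dim)

    def compose(self, task_emb, x):
        W = self.generate(task_emb)   # task-specific composition weights
        return x @ W

meta = MetaNetwork(task_dim=3, in_dim=8, out_dim=5)
x = rng.standard_normal((2, 8))   # a small batch of representations
task_a = rng.standard_normal(3)   # hypothetical task embedding
out = meta.compose(task_a, x)     # shape (2, 5)
```

Because only `self.proj` is shared, adding a new task costs one small embedding vector instead of a full set of composition parameters.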