no code implementations • 26 Mar 2024 • Liangchen Li, Jiajun He
Interestingly, diffusion models (DMs) can also invert an input image to noise by moving backward along the probability flow (PF) ODE, a key operation for downstream tasks such as interpolation and image editing.
no code implementations • 28 Jan 2024 • Wenxin Xiong, Jiajun He, Zhang-Lei Shi, Keyuan Hu, Hing Cheung So, Chi-Sing Leung
This short communication addresses the problem of elliptic localization with outlier measurements, whose occurrences are prevalent in various location-enabled applications and can significantly compromise the positioning performance if not adequately handled.
no code implementations • 24 Jan 2024 • Jiajun He, Xiaohan Shi, Xingfeng Li, Tomoki Toda
Therefore, in this paper, we incorporate two auxiliary tasks, ASR error detection (AED) and ASR error correction (AEC), to enhance the semantic coherence of ASR text, and further introduce a novel multi-modal fusion (MF) method to learn shared representations across modalities.
Automatic Speech Recognition (ASR)
no code implementations • 13 Nov 2023 • Xiaohan Shi, Jiajun He, Xingfeng Li, Tomoki Toda
This paper proposes an efficient approach to noisy speech emotion recognition (NSER).
Automatic Speech Recognition (ASR)
no code implementations • 8 Oct 2023 • Jiajun He, Zekun Yang, Tomoki Toda
Automatic speech recognition (ASR) systems often encounter difficulties in accurately recognizing rare words, leading to errors that can have a negative impact on downstream tasks such as keyword spotting, intent detection, and text summarization.
Automatic Speech Recognition (ASR)
1 code implementation • 29 Sep 2023 • Jiajun He, Gergely Flamich, Zongyu Guo, José Miguel Hernández-Lobato
COMpression with Bayesian Implicit NEural Representations (COMBINER) is a recent data compression method that addresses a key inefficiency of previous Implicit Neural Representation (INR)-based approaches: it avoids quantization and enables direct optimization of the rate-distortion performance.
1 code implementation • NeurIPS 2023 • Zongyu Guo, Gergely Flamich, Jiajun He, Zhibo Chen, José Miguel Hernández-Lobato
Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image.
no code implementations • 14 Aug 2021 • Xiaobo Jiang, Kun He, Jiajun He, Guangyu Yan
Entity extraction is a key technology for obtaining information from massive texts in natural language processing.