no code implementations • ROCLING 2022 • Qiu-Xia Zhang, Te-Yu Chi, Te-Lun Yang, Jyh-Shing Roger Jang
This study uses training and validation data from the “ROCLING 2022 Chinese Health Care Named Entity Recognition Task” for modeling.
no code implementations • 8 Jun 2025 • Xuanjun Chen, I-Ming Lin, Lin Zhang, Haibin Wu, Hung-Yi Lee, Jyh-Shing Roger Jang
Recent attempts at source tracing for codec-based deepfake speech (CodecFake), generated by neural audio codec-based speech generation (CoSG) models, have exhibited suboptimal performance.
no code implementations • 8 Jan 2025 • Te-Lun Yang, Jyi-Shane Liu, Yuen-Hsien Tseng, Jyh-Shing Roger Jang
Using TTQA and TMMLU+ as evaluation datasets, the system employs BGE-M3 for dense vector retrieval to obtain highly relevant search results and BGE-reranker to reorder these results based on query relevance.
no code implementations • 30 Dec 2024 • Tun-Chieh Lou, Chung-Che Wang, Jyh-Shing Roger Jang, Henian Li, Lang Lin, Norman Chang
This paper proposes the use of iterative transfer learning applied to deep learning models for side-channel attacks.
1 code implementation • 28 Oct 2024 • Chih-Hsiang Hsu, Jyh-Shing Roger Jang
Our results demonstrate that existing 3D human pose estimation models can be significantly enhanced through this adjustment process.
no code implementations • 30 Jun 2024 • Chun-Hsiang Wang, Chung-Che Wang, Jun-You Wang, Jyh-Shing Roger Jang, Yen-Hsun Chu
Source-to-distortion ratio, real-time factor, and optimal latency are employed to evaluate the performance.
no code implementations • 7 Jun 2024 • Xuanjun Chen, Jiawei Du, Haibin Wu, Jyh-Shing Roger Jang, Hung-Yi Lee
In this paper, we propose a neural codec-based adversarial sample detection method for ASV.
1 code implementation • 5 Jun 2024 • Xuanjun Chen, Haibin Wu, Jyh-Shing Roger Jang, Hung-Yi Lee
Detecting singing voice deepfakes, or SingFake, involves determining the authenticity and copyright of a singing voice.
1 code implementation • 20 Feb 2024 • Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee
Speech emotion recognition (SER) is a pivotal technology for human-computer interaction systems.
no code implementations • 27 Nov 2023 • Yu-Chen Lin, Akhilesh Kumar, Norman Chang, Wenliang Zhang, Muhammad Zakir, Rucha Apte, Haiyang He, Chao Wang, Jyh-Shing Roger Jang
We present four main contributions to enhance the performance of Large Language Models (LLMs) in generating domain-specific code: (i) utilizing LLM-based data splitting and data renovation techniques to improve the semantic representation of embeddings' space; (ii) introducing the Chain of Density for Renovation Credibility (CoDRC), driven by LLMs, and the Adaptive Text Renovation (ATR) algorithm for assessing data renovation reliability; (iii) developing the Implicit Knowledge Expansion and Contemplation (IKEC) Prompt technique; and (iv) effectively refactoring existing scripts to generate new and high-quality scripts with LLMs.
1 code implementation • 21 Nov 2023 • Jun-You Wang, Chon-In Leong, Yu-Chen Lin, Li Su, Jyh-Shing Roger Jang
With the use of data augmentation and a source separation model, results show that the proposed method achieves a character error rate of less than 18% on a Mandarin polyphonic dataset for lyrics transcription, and a mean absolute error of 0.071 seconds for lyrics alignment.
1 code implementation • 28 Jul 2023 • Te-Yu Chi, Yu-Meng Tang, Chia-Wen Lu, Qiu-Xia Zhang, Jyh-Shing Roger Jang
To achieve this objective, we propose a novel self-training strategy that uses labels rather than text for training, significantly reducing the model's training time.
no code implementations • 16 Feb 2023 • Chung-Che Wang, Yu-Chun Lin, Yu-Teng Hsu, Jyh-Shing Roger Jang
A siamese network is used to compare the inputs and predict the preference.
2 code implementations • 27 Oct 2022 • Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-Yi Lee, Jyh-Shing Roger Jang
This paper proposes an MTDVocaLiST model, which is trained with our proposed multimodal Transformer distillation (MTD) loss.
no code implementations • 3 Oct 2022 • Xuanjun Chen, Haibin Wu, Helen Meng, Hung-Yi Lee, Jyh-Shing Roger Jang
Audio-visual active speaker detection (AVASD) is well-developed, and now is an indispensable front-end for several multi-modal applications.
no code implementations • 31 Mar 2022 • Yen-Lun Liao, Xuanjun Chen, Chung-Che Wang, Jyh-Shing Roger Jang
The countermeasure (CM) model is developed to protect Automatic Speaker Verification (ASV) systems from spoofing attacks and to prevent the resulting leakage of personal information.
1 code implementation • 4 Dec 2018 • Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang
In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition.
Sound • Audio and Speech Processing
no code implementations • 31 Oct 2017 • Zhe-Cheng Fan, Yen-Lin Lai, Jyh-Shing Roger Jang
Separating two sources from an audio mixture is an important task with many applications.
no code implementations • ROCLING/IJCLCLP 2012 • Wei-jay Huang, Jhih-rou Lin, Ren-Yuan Lyu, Yuang-chin Chiang, Jyh-Shing Roger Jang, Ming-Tat Ko