no code implementations • 8 Nov 2024 • Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Chih-Kai Yang, Fabian Ritter-Gutierrez, Ming To Chuang, Kuan-Po Huang, Siddhant Arora, You-Kuan Lin, Eunjung Yeo, Kalvin Chang, Chung-Ming Chien, Kwanghee Choi, Cheng-Hsiu Hsieh, Yi-Cheng Lin, Chee-En Yu, I-Hsiang Chiu, Heitor R. Guimarães, Jionghao Han, Tzu-Quan Lin, Tzu-Yuan Lin, Homu Chang, Ting-Wu Chang, Chun Wei Chen, Shou-Jen Chen, Yu-Hua Chen, Hsi-Chun Cheng, Kunal Dhawan, Jia-Lin Fang, Shi-Xin Fang, Kuan-Yu Fang Chiang, Chi An Fu, Hsien-Fu Hsiao, Ching Yu Hsu, Shao-Syuan Huang, Lee Chen Wei, Hsi-Che Lin, Hsuan-Hao Lin, Hsuan-Ting Lin, Jian-Ren Lin, Ting-Chun Liu, Li-Chun Lu, Tsung-Min Pai, Ankita Pasad, Shih-Yun Shan Kuan, Suwon Shon, Yuxun Tang, Yun-Shao Tsai, Jui-Chiang Wei, Tzu-Chieh Wei, Chengxi Wu, Dien-Ruei Wu, Chao-Han Huck Yang, Chieh-Chi Yang, Jia Qi Yip, Shao-Xiang Yuan, Vahid Noroozi, Zhehuai Chen, Haibin Wu, Karen Livescu, David Harwath, Shinji Watanabe, Hung-Yi Lee
We present Dynamic-SUPERB Phase-2, an open and evolving benchmark for the comprehensive evaluation of instruction-based universal speech models.
no code implementations • 10 Feb 2023 • Pedro Sarmento, Adarsh Kumar, Yu-Hua Chen, CJ Carr, Zack Zukowski, Mathieu Barthet
We trained a BERT model for downstream genre classification and used it to assess the results obtained with the genre-CTRL model.
1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Joann Ching, Wen-Yi Hsiao, Yu-Hua Chen, Alvin Wen-Yu Su, Yi-Hsuan Yang
Due to advances in deep learning, the performance of automatic beat and downbeat tracking in musical audio signals has seen great improvement in recent years.
no code implementations • 4 Aug 2020 • Yu-Hua Chen, Yu-Hsiang Huang, Wen-Yi Hsiao, Yi-Hsuan Yang
Deep learning algorithms are increasingly developed for learning to compose music in the form of MIDI files.
Sound • Audio and Speech Processing
no code implementations • 28 Jun 2020 • Rui Gong, Dengxin Dai, Yu-Hua Chen, Wen Li, Luc van Gool
AIT achieves this zero-shot image translation capability by coupling a supervised training scheme in the synthetic domain, a cycle consistency strategy in the real domain, an adversarial training scheme between the two domains, and a novel network design.
1 code implementation • 18 May 2020 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.
1 code implementation • CVPR 2021 • Jinhong Deng, Wen Li, Yu-Hua Chen, Lixin Duan
We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies.
1 code implementation • 26 Dec 2019 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.
no code implementations • 15 Dec 2019 • Suman Saha, Wen-Hao Xu, Menelaos Kanakis, Stamatios Georgoulis, Yu-Hua Chen, Danda Pani Paudel, Luc van Gool
Face anti-spoofing is a measure towards this direction for biometric user authentication, and in particular face recognition, that tries to prevent spoof attacks.
no code implementations • 10 Jul 2019 • Jiancong Wang, Yu-Hua Chen, Yifan Wu, Jianbo Shi, James Gee
Single image super-resolution (SISR) reconstruction for magnetic resonance imaging (MRI) has generated significant interest because of its potential not only to speed up imaging but also to improve quantitative processing and analysis of available image data.
no code implementations • 6 Mar 2019 • Chao-Chun Hsu, Yu-Hua Chen, Zi-Yuan Chen, Hsin-Yu Lin, Ting-Hao 'Kenneth' Huang, Lun-Wei Ku
In this paper, we introduce Dixit, an interactive visual storytelling system that the user interacts with iteratively to compose a short story for a photo sequence.
1 code implementation • CVPR 2019 • Rui Gong, Wen Li, Yu-Hua Chen, Luc van Gool
In this work, we present a domain flow generation (DLOW) model to bridge two different domains by generating a continuous sequence of intermediate domains flowing from one domain to the other.
no code implementations • 1 Mar 2018 • Sergi Caelles, Alberto Montes, Kevis-Kokitsi Maninis, Yu-Hua Chen, Luc van Gool, Federico Perazzi, Jordi Pont-Tuset
Motivated by the analysis of the results of the 2017 edition, the main track of the competition will be the same as in the previous edition (segmentation given the full mask of the objects in the first frame, i.e., the semi-supervised scenario).
no code implementations • 20 Feb 2018 • Siming Yan, Feng Shi, Yu-Hua Chen, Damini Dey, Sang-Eun Lee, Hyuk-Jae Chang, Debiao Li, Yibin Xie
Coronary calcium causes beam hardening and blooming artifacts on cardiac computed tomography angiography (CTA) images, which lead to overestimation of lumen stenosis and reduction of diagnostic specificity.
no code implementations • 18 Sep 2017 • Kevis-Kokitsi Maninis, Sergi Caelles, Yu-Hua Chen, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc van Gool
Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames.
Ranked #47 on Semi-Supervised Video Object Segmentation on DAVIS 2016
no code implementations • 6 Apr 2017 • Sergi Caelles, Yu-Hua Chen, Jordi Pont-Tuset, Luc van Gool
This paper tackles the problem of semi-supervised video object segmentation, that is, segmenting an object in a sequence given its mask in the first frame.