1 code implementation • 8 Mar 2025 • You Zhang, Jin Wang, Liang-Chih Yu, Dan Xu, Xuejie Zhang
Current neural networks often employ multi-domain-learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information for text understanding tasks by capturing individual characteristics and the relationships among samples.
no code implementations • 22 Feb 2025 • Kyungbok Lee, You Zhang, Zhiyao Duan
Recent works attempt to overcome the challenge of limited data by leveraging the segmentation foundation model, SAM, prompting it with audio to enhance its ability to segment sounding source objects.
no code implementations • 13 Feb 2025 • Xin Wang, Héctor Delgado, Hemlata Tak, Jee-weon Jung, Hye-jin Shim, Massimiliano Todisco, Ivan Kukanov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen, Nicholas Evans, Kong Aik Lee, Junichi Yamagishi, Myeonghun Jeong, Ge Zhu, Yongyi Zang, You Zhang, Soumi Maiti, Florian Lux, Nicolas Müller, Wangyou Zhang, Chengzhe Sun, Shuwei Hou, Siwei Lyu, Sébastien Le Maguer, Cheng Gong, Hanjie Guo, Liping Chen, Vishwanath Singh
The database contains attacks generated with 32 different algorithms, also crowdsourced, and optimised to varying degrees using new surrogate detection models.
1 code implementation • 25 Nov 2024 • Guoping Xu, Xiaoxue Qian, Hua Chieh Shao, Jax Luo, Weiguo Lu, You Zhang
This study introduces SAMatch, a SAM-guided Match-based framework for semi-supervised medical image segmentation, aimed at improving pseudo label quality in data-scarce scenarios.
no code implementations • 25 Sep 2024 • Kun Zhou, You Zhang, Shengkui Zhao, Hao Wang, Zexu Pan, Dianwen Ng, Chong Zhang, Chongjia Ni, Yukun Ma, Trung Hieu Nguyen, Jia Qi Yip, Bin Ma
Current emotional text-to-speech (TTS) systems face challenges in mimicking a broad spectrum of human emotions due to the inherent complexity of emotions and limitations in emotional speech datasets and models.
1 code implementation • 28 Aug 2024 • You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan
With the advancements in singing voice generation and the growing presence of AI singers on media platforms, the inaugural Singing Voice Deepfake Detection (SVDD) Challenge aims to advance research in identifying AI-generated singing voices from authentic singers.
1 code implementation • 20 Jun 2024 • Kyungbok Lee, You Zhang, Zhiyao Duan
Additionally, to ensure the credibility of detection methods, it is beneficial for the model to interpret which cues from the video indicate it is fake.
2 code implementations • 4 Jun 2024 • Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan
Addressing these gaps, we introduce CtrSVDD, a large-scale, diverse collection of bonafide and deepfake singing vocals.
1 code implementation • 8 May 2024 • You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Tomoki Toda, Zhiyao Duan
The rapid advancement of AI-generated singing voices, which now closely mimic natural human singing and align seamlessly with musical scores, has led to heightened concerns for artists and the music industry.
no code implementations • 1 Apr 2024 • Jiacheng Xie, Hua-Chieh Shao, Yunxiang Li, You Zhang
PFGDM-B, on the other hand, applies the prior CT information as a condition in every reconstruction step, but with a decaying mechanism that gradually phases out the reconstruction guidance from the prior CT scans.
1 code implementation • 10 Mar 2024 • You Zhang, Jin Wang, Liang-Chih Yu, Dan Xu, Xuejie Zhang
Effectively and efficiently adapting a pre-trained language model (PLM) for human-centered text understanding (HCTU) is challenging since user tokens are million-level in most personalized applications and do not have concrete explicit semantics.
1 code implementation • 19 Dec 2023 • Yuyang Xia, Shuncheng Liu, Quanlin Yu, Liwei Deng, You Zhang, Han Su, Kai Zheng
Autonomous driving is an emerging technology that has advanced rapidly over the last decade.
1 code implementation • 24 Nov 2023 • Enting Zhou, You Zhang, Zhiyao Duan
In this work, we propose to learn the AV representation from categorical emotion labels of speech.
no code implementations • 19 Nov 2023 • Yunxiang Li, Hua-Chieh Shao, Xiaoxue Qian, You Zhang
This anatomical information then guides a subsequent diffusion model to generate high-quality CT images.
1 code implementation • 29 Sep 2023 • Yunxiang Li, Bowen Jing, Zihan Li, Jing Wang, You Zhang
To combine the strengths of foundational and domain-specific models, we propose nnSAM, integrating SAM's robust feature extraction with nnUNet's automatic configuration to enhance segmentation accuracy on small datasets.
1 code implementation • 14 Sep 2023 • Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan
These unique properties make singing voice deepfake detection a relevant but significantly different problem from synthetic speech detection.
2 code implementations • 27 Jul 2023 • Yutong Wen, You Zhang, Zhiyao Duan
We further show that these normalized HRTFs can be used to learn a more unified HRTF representation across databases than the prior art.
1 code implementation • 24 May 2023 • Yunxiang Li, Meixu Chen, Kai Wang, Jun Ma, Alan C. Bovik, You Zhang
Image translation has wide applications, such as style transfer and modality conversion, usually aiming to generate images having both high degrees of realism and faithfulness.
1 code implementation • 5 Apr 2023 • Yunxiang Li, Hua-Chieh Shao, Xiao Liang, Liyuan Chen, RuiQi Li, Steve Jiang, Jing Wang, You Zhang
However, existing diffusion models struggle to retain structural information in medical image translation: the structural details of source-domain images are lost during the forward diffusion process and cannot be fully recovered through learned reverse diffusion. This matters because the integrity of anatomical structures is extremely important in medical images.
1 code implementation • 24 Mar 2023 • Yunxiang Li, Zihan Li, Kai Zhang, Ruilong Dan, Steve Jiang, You Zhang
The primary aim of this research was to address the limitations observed in the medical knowledge of prevalent large language models (LLMs) such as ChatGPT, by creating a specialized language model with enhanced accuracy in medical advice.
2 code implementations • 4 Nov 2022 • Siwen Ding, You Zhang, Zhiyao Duan
Our previous research on one-class learning has improved the generalization ability to unseen attacks by compacting the bona fide speech in the embedding space.
2 code implementations • 27 Oct 2022 • You Zhang, Yuxiang Wang, Zhiyao Duan
In this work, we propose to use neural fields, a differentiable representation of functions through neural networks, to model HRTFs with arbitrary spatial sampling schemes.
1 code implementation • 22 Sep 2022 • Kai Wang, Yunxiang Li, Michael Dohopolski, Tao Peng, Weiguo Lu, You Zhang, Jing Wang
In head and neck cancer (HNC) patient management, automatic gross tumor volume (GTV) segmentation and accurate pre-treatment cancer recurrence prediction are of great importance in helping physicians design personalized management plans, which have the potential to improve treatment outcomes and quality of life for HNC patients.
1 code implementation • 28 Jul 2022 • Yuxiang Wang, You Zhang, Zhiyao Duan, Mark Bocko
For the HRTF data, we use truncated spherical harmonic (SH) coefficients to represent the HRTF magnitudes and onsets.
1 code implementation • 29 Jun 2022 • Zihan Li, Yunxiang Li, Qingde Li, Puyang Wang, Dazhou Guo, Le Lu, Dakai Jin, You Zhang, Qingqi Hong
In our LViT model, medical text annotation is incorporated to compensate for the quality deficiency in image data.
Ranked #3 on Medical Image Segmentation on MoNuSeg
no code implementations • 21 Jun 2022 • Abudukelimu Wuerkaixi, You Zhang, Zhiyao Duan, ChangShui Zhang
This clarification of the definition is motivated by our extensive experiments, through which we discover that existing ASD methods fail to model audio-visual synchronization and often classify unsynchronized videos as active speaking.
no code implementations • 8 Mar 2022 • Yunxiang Li, Ruilong Dan, Shuai Wang, Yifan Cao, Xiangde Luo, Chenghao Tan, Gangyong Jia, Huiyu Zhou, You Zhang, Yaqi Wang, Li Wang
For instance, the model trained on a dataset with specific imaging parameters cannot be well applied to other datasets with different imaging parameters.
1 code implementation • 10 Feb 2022 • You Zhang, Ge Zhu, Zhiyao Duan
We further propose fusion strategies for direct inference and fine-tuning to predict the SASV score based on the framework.
2 code implementations • 26 Jul 2021 • Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan
Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability, while the new task DF presents general audio compression.
no code implementations • 23 Apr 2021 • Jaehee Chun, Justin C. Park, Sven Olberg, You Zhang, Dan Nguyen, Jing Wang, Jin Sung Kim, Steve Jiang
Finally, in the sCT reconstruction task, the MAE is reduced from 68 to 22 HU by utilizing the IDOL framework.
3 code implementations • 3 Apr 2021 • You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan
Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials.
3 code implementations • 27 Oct 2020 • You Zhang, Fei Jiang, Zhiyao Duan
Human voices can be used to authenticate the identity of the speaker, but automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as impersonation, replay, text-to-speech, and voice conversion.
1 code implementation • 8 Aug 2020 • Sefik Emre Eskimez, You Zhang, Zhiyao Duan
Visual emotion expression plays an important role in audiovisual speech communication.
no code implementations • 3 May 2020 • Liang Huang, You Zhang, Weijian Pan, Jinyin Chen, Li Ping Qian, Yuan Wu
Extensive numerical results show that both the CNN-based classifier and the LSTM-based classifier extract similar radio features relating to modulation reference points.
no code implementations • 6 Dec 2019 • Liang Huang, Weijian Pan, You Zhang, LiPing Qian, Nan Gao, Yuan Wu
Deep learning has recently been applied to automatically classify the modulation categories of received radio signals without manual experience.
no code implementations • 15 May 2019 • Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, Ren Ng
We present a system that synthetically renders refocusable video from a deep DOF video shot with a smartphone, and analyzes future video frames to deliver context-aware autofocus for the current frame.
no code implementations • 26 Jun 2018 • Fei Wen, You Zhang, Wei Wang
Whereafter, the normalized Laplacian spectra of $G_1^S\bowtie (G_2^V\cup G_3^E)$ and $G_1^S\diamondsuit(G_2^V\cup G_3^E)$ are respectively determined in terms of the corresponding normalized Laplacian spectra of the connected regular graphs $G_{1}$, $G_{2}$ and $G_{3}$, which extend the corresponding results of [A. Das, P. Panigrahi, Linear Multil.
Combinatorics
no code implementations • SEMEVAL 2018 • You Zhang, Jin Wang, Xue-jie Zhang
Our system mainly applies a BiLSTM (Bidirectional Long Short-Term Memory) model with an attention mechanism.
no code implementations • IJCNLP 2017 • Hang Yuan, You Zhang, Jin Wang, Xue-jie Zhang
A shared task is a typical question answering task that aims to test how accurately the participants can answer the questions in exams.
no code implementations • 4 Oct 2017 • Zhiguo Zhou, Zhi-Jie Zhou, Hongxia Hao, Shulong Li, Xi Chen, You Zhang, Michael Folkert, Jing Wang
First, the predictive performance of the model may be reduced when features extracted from an individual imaging modality are blindly combined into a single predictive model.
no code implementations • WS 2017 • You Zhang, Hang Yuan, Jin Wang, Xue-jie Zhang
In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task.