Search Results for author: You Zhang

Found 42 papers, 26 papers with code

Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective

1 code implementation 8 Mar 2025 You Zhang, Jin Wang, Liang-Chih Yu, Dan Xu, Xuejie Zhang

Current neural networks often employ multi-domain-learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information for text understanding tasks by capturing individual characteristics and the relationships among samples.

Attribute

Audio Visual Segmentation Through Text Embeddings

no code implementations 22 Feb 2025 Kyungbok Lee, You Zhang, Zhiyao Duan

Recent works attempt to overcome the challenge of limited data by leveraging the segmentation foundation model, SAM, prompting it with audio to enhance its ability to segment sounding source objects.

Segmentation

A SAM-guided and Match-based Semi-Supervised Segmentation Framework for Medical Imaging

1 code implementation 25 Nov 2024 Guoping Xu, Xiaoxue Qian, Hua Chieh Shao, Jax Luo, Weiguo Lu, You Zhang

This study introduces SAMatch, a SAM-guided Match-based framework for semi-supervised medical image segmentation, aimed at improving pseudo label quality in data-scarce scenarios.

Image Segmentation Pseudo Label

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions

no code implementations 25 Sep 2024 Kun Zhou, You Zhang, Shengkui Zhao, Hao Wang, Zexu Pan, Dianwen Ng, Chong Zhang, Chongjia Ni, Yukun Ma, Trung Hieu Nguyen, Jia Qi Yip, Bin Ma

Current emotional text-to-speech (TTS) systems face challenges in mimicking a broad spectrum of human emotions due to the inherent complexity of emotions and limitations in emotional speech datasets and models.

Attribute Dimensionality Reduction

SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge

1 code implementation 28 Aug 2024 You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan

With the advancements in singing voice generation and the growing presence of AI singers on media platforms, the inaugural Singing Voice Deepfake Detection (SVDD) Challenge aims to advance research in identifying AI-generated singing voices from authentic singers.

DeepFake Detection Face Swapping

A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection

1 code implementation 20 Jun 2024 Kyungbok Lee, You Zhang, Zhiyao Duan

Additionally, to ensure the credibility of detection methods, it is beneficial for the model to interpret which cues from the video indicate it is fake.

DeepFake Detection Face Swapping

SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan

1 code implementation 8 May 2024 You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Tomoki Toda, Zhiyao Duan

The rapid advancement of AI-generated singing voices, which now closely mimic natural human singing and align seamlessly with musical scores, has led to heightened concerns for artists and the music industry.

DeepFake Detection Face Swapping

Prior Frequency Guided Diffusion Model for Limited Angle (LA)-CBCT Reconstruction

no code implementations 1 Apr 2024 Jiacheng Xie, Hua-Chieh Shao, Yunxiang Li, You Zhang

PFGDM-B, on the other hand, continuously applies the prior CT information condition in every reconstruction step, while with a decaying mechanism, to gradually phase out the reconstruction guidance from the prior CT scans.

SSIM

Personalized LoRA for Human-Centered Text Understanding

1 code implementation 10 Mar 2024 You Zhang, Jin Wang, Liang-Chih Yu, Dan Xu, Xuejie Zhang

Effectively and efficiently adapting a pre-trained language model (PLM) for human-centered text understanding (HCTU) is challenging since user tokens are million-level in most personalized applications and do not have concrete explicit semantics.

Language Modeling Language Modelling

nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance

1 code implementation 29 Sep 2023 Yunxiang Li, Bowen Jing, Zihan Li, Jing Wang, You Zhang

To combine the strengths of foundational and domain-specific models, we propose nnSAM, integrating SAM's robust feature extraction with nnUNet's automatic configuration to enhance segmentation accuracy on small datasets.

Few-Shot Learning Heart Segmentation

SingFake: Singing Voice Deepfake Detection

1 code implementation 14 Sep 2023 Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan

These unique properties make singing voice deepfake detection a relevant but significantly different problem from synthetic speech detection.

Face Swapping Singing Voice Synthesis

Mitigating Cross-Database Differences for Learning Unified HRTF Representation

2 code implementations 27 Jul 2023 Yutong Wen, You Zhang, Zhiyao Duan

We further show that these normalized HRTFs can be used to learn a more unified HRTF representation across databases than the prior art.

SAMScore: A Content Structural Similarity Metric for Image Translation Evaluation

1 code implementation 24 May 2023 Yunxiang Li, Meixu Chen, Kai Wang, Jun Ma, Alan C. Bovik, You Zhang

Image translation has wide applications, such as style transfer and modality conversion, usually aiming to generate images having both high degrees of realism and faithfulness.

Semantic Similarity Semantic Textual Similarity

Zero-shot Medical Image Translation via Frequency-Guided Diffusion Models

1 code implementation 5 Apr 2023 Yunxiang Li, Hua-Chieh Shao, Xiao Liang, Liyuan Chen, RuiQi Li, Steve Jiang, Jing Wang, You Zhang

However, for medical image translation, the existing diffusion models are deficient in accurately retaining structural information since the structure details of source domain images are lost during the forward diffusion process and cannot be fully recovered through learned reverse diffusion, while the integrity of anatomical structures is extremely important in medical images.

Anatomy SSIM

ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge

1 code implementation 24 Mar 2023 Yunxiang Li, Zihan Li, Kai Zhang, Ruilong Dan, Steve Jiang, You Zhang

The primary aim of this research was to address the limitations observed in the medical knowledge of prevalent large language models (LLMs) such as ChatGPT, by creating a specialized language model with enhanced accuracy in medical advice.

Information Retrieval Language Modeling

SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing

2 code implementations 4 Nov 2022 Siwen Ding, You Zhang, Zhiyao Duan

Our previous research on one-class learning has improved the generalization ability to unseen attacks by compacting the bona fide speech in the embedding space.

Diversity Speaker Verification

HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields

2 code implementations 27 Oct 2022 You Zhang, Yuxiang Wang, Zhiyao Duan

In this work, we propose to use neural fields, a differentiable representation of functions through neural networks, to model HRTFs with arbitrary spatial sampling schemes.

Recurrence-free Survival Prediction under the Guidance of Automatic Gross Tumor Volume Segmentation for Head and Neck Cancers

1 code implementation 22 Sep 2022 Kai Wang, Yunxiang Li, Michael Dohopolski, Tao Peng, Weiguo Lu, You Zhang, Jing Wang

For Head and Neck Cancers (HNC) patient management, automatic gross tumor volume (GTV) segmentation and accurate pre-treatment cancer recurrence prediction are of great importance to assist physicians in designing personalized management plans, which have the potential to improve the treatment outcome and quality of life for HNC patients.

Management Prediction

Predicting Global HRTFs From Scanned Head Geometry Using Deep Learning and Compact Representations

1 code implementation 28 Jul 2022 Yuxiang Wang, You Zhang, Zhiyao Duan, Mark Bocko

For the HRTF data, we use truncated spherical harmonic (SH) coefficients to represent the HRTF magnitudes and onsets.

Rethinking Audio-visual Synchronization for Active Speaker Detection

no code implementations 21 Jun 2022 Abudukelimu Wuerkaixi, You Zhang, Zhiyao Duan, ChangShui Zhang

This clarification of definition is motivated by our extensive experiments, through which we discover that existing ASD methods fail in modeling the audio-visual synchronization and often classify unsynchronized videos as active speaking.

Active Speaker Detection Audio-Visual Synchronization

Plug-and-play Shape Refinement Framework for Multi-site and Lifespan Brain Skull Stripping

no code implementations 8 Mar 2022 Yunxiang Li, Ruilong Dan, Shuai Wang, Yifan Cao, Xiangde Luo, Chenghao Tan, Gangyong Jia, Huiyu Zhou, You Zhang, Yaqi Wang, Li Wang

For instance, the model trained on a dataset with specific imaging parameters cannot be well applied to other datasets with different imaging parameters.

Skull Stripping Source-Free Domain Adaptation

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

1 code implementation 10 Feb 2022 You Zhang, Ge Zhu, Zhiyao Duan

We further propose fusion strategies for direct inference and fine-tuning to predict the SASV score based on the framework.

Speaker Verification

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

2 code implementations 26 Jul 2021 Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability, while the new task DF presents general audio compression.

Audio Compression Face Swapping

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

3 code implementations 3 Apr 2021 You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan

Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials.

Data Augmentation Multi-Task Learning

One-class learning towards generalized voice spoofing detection

3 code implementations 27 Oct 2020 You Zhang, Fei Jiang, Zhiyao Duan

Human voices can be used to authenticate the identity of the speaker, but the automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as impersonation, replay, text-to-speech, and voice conversion.

Speaker Verification Text to Speech

Visualizing Deep Learning-based Radio Modulation Classifier

no code implementations 3 May 2020 Liang Huang, You Zhang, Weijian Pan, Jinyin Chen, Li Ping Qian, Yuan Wu

Extensive numerical results show both the CNN-based classifier and LSTM-based classifier extract similar radio features relating to modulation reference points.

Deep Learning General Classification

Data Augmentation for Deep Learning-based Radio Modulation Classification

no code implementations 6 Dec 2019 Liang Huang, Weijian Pan, You Zhang, LiPing Qian, Nan Gao, Yuan Wu

Deep learning has recently been applied to automatically classify the modulation categories of received radio signals without manual experience.

Classification Data Augmentation

Synthetic Defocus and Look-Ahead Autofocus for Casual Videography

no code implementations 15 May 2019 Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, Ren Ng

We present a system that synthetically renders refocusable video from a deep DOF video shot with a smartphone, and analyzes future video frames to deliver context-aware autofocus for the current frame.

BIG-bench Machine Learning Saliency Detection

The normalized Laplacian spectra of subdivision vertex-edge neighbourhood vertex(edge)-corona for graphs

no code implementations 26 Jun 2018 Fei Wen, You Zhang, Wei Wang

Whereafter, the normalized Laplacian spectra of $G_1^S\bowtie (G_2^V\cup G_3^E)$ and $G_1^S\diamondsuit(G_2^V\cup G_3^E)$ are respectively determined in terms of the corresponding normalized Laplacian spectra of the connected regular graphs $G_{1}$, $G_{2}$ and $G_{3}$, which extend the corresponding results of [A. Das, P. Panigrahi, Linear Multil.

Combinatorics

Constructing multi-modality and multi-classifier radiomics predictive models through reliable classifier fusion

no code implementations 4 Oct 2017 Zhiguo Zhou, Zhi-Jie Zhou, Hongxia Hao, Shulong Li, Xi Chen, You Zhang, Michael Folkert, Jing Wang

First, the predictive performance of the model may be reduced when features extracted from an individual imaging modality are blindly combined into a single predictive model.

YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction

no code implementations WS 2017 You Zhang, Hang Yuan, Jin Wang, Xue-jie Zhang

In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task.

Sentiment Analysis
