nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance

1 code implementation 29 Sep 2023 Yunxiang Li, Bowen Jing, Zihan Li, Jing Wang, You Zhang

The recent developments of foundation models in computer vision, especially the Segment Anything Model (SAM), allow scalable and domain-agnostic image segmentation to serve as a general-purpose segmentation tool.

Few-Shot Learning, Image Segmentation +2

SingFake: Singing Voice Deepfake Detection

no code implementations 14 Sep 2023 Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan

These unique properties make singing voice deepfake detection a relevant but significantly different problem from synthetic speech detection.

Face Swapping, Singing Voice Synthesis +1

Mitigating Cross-Database Differences for Learning Unified HRTF Representation

1 code implementation 27 Jul 2023 Yutong Wen, You Zhang, Zhiyao Duan

We further show that these normalized HRTFs can be used to learn a more unified HRTF representation across databases than the prior art.

SAMScore: A Semantic Structural Similarity Metric for Image Translation Evaluation

1 code implementation 24 May 2023 Yunxiang Li, Meixu Chen, Wenxuan Yang, Kai Wang, Jun Ma, Alan C. Bovik, You Zhang

Image translation has wide applications, such as style transfer and modality conversion, usually aiming to generate images having both high degrees of realism and faithfulness.

Semantic Similarity, Semantic Textual Similarity +2

Zero-shot Medical Image Translation via Frequency-Guided Diffusion Models

no code implementations 5 Apr 2023 Yunxiang Li, Hua-Chieh Shao, Xiao Liang, Liyuan Chen, RuiQi Li, Steve Jiang, Jing Wang, You Zhang

However, for medical image translation, the existing diffusion models are deficient in accurately retaining structural information since the structure details of source domain images are lost during the forward diffusion process and cannot be fully recovered through learned reverse diffusion, while the integrity of anatomical structures is extremely important in medical images.

Anatomy, Translation +1

ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge

1 code implementation 24 Mar 2023 Yunxiang Li, Zihan Li, Kai Zhang, Ruilong Dan, Steve Jiang, You Zhang

The primary aim of this research was to address the limitations observed in the medical knowledge of prevalent large language models (LLMs) such as ChatGPT, by creating a specialized language model with enhanced accuracy in medical advice.

Information Retrieval, Language Modelling +3

SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing

1 code implementation 4 Nov 2022 Siwen Ding, You Zhang, Zhiyao Duan

Our previous research on one-class learning has improved the generalization ability to unseen attacks by compacting the bona fide speech in the embedding space.

Speaker Verification, Speech Synthesis +1

HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields

1 code implementation 27 Oct 2022 You Zhang, Yuxiang Wang, Zhiyao Duan

In this work, we propose to use neural fields, a differentiable representation of functions through neural networks, to model HRTFs with arbitrary spatial sampling schemes.

Recurrence-free Survival Prediction under the Guidance of Automatic Gross Tumor Volume Segmentation for Head and Neck Cancers

1 code implementation 22 Sep 2022 Kai Wang, Yunxiang Li, Michael Dohopolski, Tao Peng, Weiguo Lu, You Zhang, Jing Wang

For Head and Neck Cancers (HNC) patient management, automatic gross tumor volume (GTV) segmentation and accurate pre-treatment cancer recurrence prediction are of great importance to assist physicians in designing personalized management plans, which have the potential to improve the treatment outcome and quality of life for HNC patients.

Management, Survival Prediction +1

Predicting Global Head-Related Transfer Functions From Scanned Head Geometry Using Deep Learning and Compact Representations

1 code implementation 28 Jul 2022 Yuxiang Wang, You Zhang, Zhiyao Duan, Mark Bocko

For the HRTF data, we use truncated spherical harmonic (SH) coefficients to represent the HRTF magnitudes and onsets.

Rethinking Audio-visual Synchronization for Active Speaker Detection

no code implementations 21 Jun 2022 Abudukelimu Wuerkaixi, You Zhang, Zhiyao Duan, ChangShui Zhang

This clarification of definition is motivated by our extensive experiments, through which we discover that existing ASD methods fail in modeling the audio-visual synchronization and often classify unsynchronized videos as active speaking.

Audio-Visual Synchronization, Contrastive Learning

Plug-and-play Shape Refinement Framework for Multi-site and Lifespan Brain Skull Stripping

no code implementations 8 Mar 2022 Yunxiang Li, Ruilong Dan, Shuai Wang, Yifan Cao, Xiangde Luo, Chenghao Tan, Gangyong Jia, Huiyu Zhou, You Zhang, Yaqi Wang, Li Wang

For instance, the model trained on a dataset with specific imaging parameters cannot be well applied to other datasets with different imaging parameters.

Skull Stripping, Source-Free Domain Adaptation

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

1 code implementation 10 Feb 2022 You Zhang, Ge Zhu, Zhiyao Duan

We further propose fusion strategies for direct inference and fine-tuning to predict the SASV score based on the framework.

Speaker Verification

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

2 code implementations 26 Jul 2021 Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability, while the new task DF presents general audio compression.

Face Swapping, Synthetic Speech Detection +1

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

3 code implementations 3 Apr 2021 You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan

Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials.

Data Augmentation, Multi-Task Learning +2

One-class learning towards generalized voice spoofing detection

3 code implementations 27 Oct 2020 You Zhang, Fei Jiang, Zhiyao Duan

Human voices can be used to authenticate the identity of the speaker, but the automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as impersonation, replay, text-to-speech, and voice conversion.

Speaker Verification, Voice Anti-spoofing +1

Visualizing Deep Learning-based Radio Modulation Classifier

no code implementations 3 May 2020 Liang Huang, You Zhang, Weijian Pan, Jinyin Chen, Li Ping Qian, Yuan Wu

Extensive numerical results show both the CNN-based classifier and LSTM-based classifier extract similar radio features relating to modulation reference points.

General Classification

Data Augmentation for Deep Learning-based Radio Modulation Classification

no code implementations 6 Dec 2019 Liang Huang, Weijian Pan, You Zhang, LiPing Qian, Nan Gao, Yuan Wu

Deep learning has recently been applied to automatically classify the modulation categories of received radio signals without manual experience.

Classification, Data Augmentation +1

Synthetic Defocus and Look-Ahead Autofocus for Casual Videography

no code implementations 15 May 2019 Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, Ren Ng

We present a system that synthetically renders refocusable video from a deep DOF video shot with a smartphone, and analyzes future video frames to deliver context-aware autofocus for the current frame.

BIG-bench Machine Learning, Saliency Detection

The normalized Laplacian spectra of subdivision vertex-edge neighbourhood vertex(edge)-corona for graphs

no code implementations 26 Jun 2018 Fei Wen, You Zhang, Wei Wang

Whereafter, the normalized Laplacian spectra of $G_1^S\bowtie (G_2^V\cup G_3^E)$ and $G_1^S\diamondsuit(G_2^V\cup G_3^E)$ are respectively determined in terms of the corresponding normalized Laplacian spectra of the connected regular graphs $G_{1}$, $G_{2}$ and $G_{3}$, which extend the corresponding results of [A. Das, P. Panigrahi, Linear Multil.


Constructing multi-modality and multi-classifier radiomics predictive models through reliable classifier fusion

no code implementations 4 Oct 2017 Zhiguo Zhou, Zhi-Jie Zhou, Hongxia Hao, Shulong Li, Xi Chen, You Zhang, Michael Folkert, Jing Wang

First, the predictive performance of the model may be reduced when features extracted from an individual imaging modality are blindly combined into a single predictive model.

YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction

no code implementations WS 2017 You Zhang, Hang Yuan, Jin Wang, Xue-jie Zhang

In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task.

Sentiment Analysis

