Search Results for author: Yixiao Zhang

Found 22 papers, 15 papers with code

Exploiting Structural Consistency of Chest Anatomy for Unsupervised Anomaly Detection in Radiography Images

1 code implementation 13 Mar 2024 Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

To this end, we propose a Simple Space-Aware Memory Matrix for In-painting and Detecting anomalies from radiography images (abbreviated as SimSID).

Anatomy Image Reconstruction +1

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

no code implementations 9 Feb 2024 Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco Martínez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged.

Music Generation Text-to-Music Generation

The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

1 code implementation 16 Nov 2023 Ilaria Manco, Benno Weck, Seungheon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models.

Music Captioning Music Generation +2

Content-based Controls For Music Large Language Modeling

1 code implementation 26 Oct 2023 Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang

We aim to further equip the models with direct and content-based controls on innate music languages such as pitch, chords and drum track.

Language Modelling Music Generation +1

Exploring XAI for the Arts: Explaining Latent Space in Generative Music

1 code implementation 10 Aug 2023 Nick Bryan-Kinns, Berker Banar, Corey Ford, Courtney N. Reed, Yixiao Zhang, Simon Colton, Jack Armitage

We increase the explainability of the model by: i) using latent space regularisation to force some specific dimensions of the latent space to map to meaningful musical attributes, ii) providing a user interface feedback loop to allow people to adjust dimensions of the latent space and observe the results of these changes in real-time, iii) providing a visualisation of the musical attributes in the latent space to help people understand and predict the effect of changes to latent space dimensions.

Music Generation

Continual Learning for Abdominal Multi-Organ and Tumor Segmentation

1 code implementation 1 Jun 2023 Yixiao Zhang, Xinyi Li, Huimiao Chen, Alan Yuille, Yaoyao Liu, Zongwei Zhou

The ability to dynamically extend a model to new data and classes is critical for multiple organ and tumor segmentation.

Continual Learning Organ Segmentation +2

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

2 code implementations ICCV 2023 Jie Liu, Yixiao Zhang, Jie-Neng Chen, Junfei Xiao, Yongyi Lu, Bennett A. Landman, Yixuan Yuan, Alan Yuille, Yucheng Tang, Zongwei Zhou

The proposed model is developed from an assembly of 14 datasets, using a total of 3,410 CT scans for training and then evaluated on 6,162 external CT scans from 3 additional datasets.

Organ Segmentation Segmentation +1

Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation

1 code implementation 10 Nov 2022 Runbang Zhang, Yixiao Zhang, Kai Shao, Ying Shan, Gus Xia

In this study, we explore the representation mapping from the domain of visual arts to the domain of music, with which we can use visual arts as an effective handle to control music generation.

Music Generation Representation Learning +1

Learning Hierarchical Metrical Structure Beyond Measures

1 code implementation 21 Sep 2022 Junyan Jiang, Daniel Chin, Yixiao Zhang, Gus Xia

In this paper, we explore a data-driven approach to automatically extract hierarchical metrical structures from scores.

Information Retrieval Music Information Retrieval +1

Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model

1 code implementation 24 Aug 2022 Yixiao Zhang, Junyan Jiang, Gus Xia, Simon Dixon

Lyric interpretations can help people understand songs and their lyrics quickly, and can also make it easier to manage, retrieve and discover songs efficiently from the growing mass of music archives.

Language Modelling Retrieval

Fast AdvProp

1 code implementation ICLR 2022 Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie

Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key for performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.

Data Augmentation object-detection +1

SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection

2 code implementations CVPR 2023 Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan L. Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

Radiography imaging protocols focus on particular body regions, therefore producing images of great similarity and yielding recurrent anatomical structures across patients.

Anatomy Unsupervised Anomaly Detection

PIANOTREE VAE: Structured Representation Learning for Polyphonic Music

2 code implementations 17 Aug 2020 Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Junbo Zhao, Gus Xia

The dominant approach for music representation learning involves the deep unsupervised model family variational autoencoder (VAE).

Music Generation Representation Learning

Learning Interpretable Representation for Controllable Polyphonic Music Generation

2 code implementations 17 Aug 2020 Ziyu Wang, Dingsu Wang, Yixiao Zhang, Gus Xia

While deep generative models have become the leading methods for algorithmic composition, it remains a challenging problem to control the generation process because the latent variables of most deep-learning models lack good interpretability.

Disentanglement Music Generation +1

When Radiology Report Generation Meets Knowledge Graph

no code implementations 19 Feb 2020 Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

no code implementations CVPR 2020 Qihang Yu, Dong Yang, Holger Roth, Yutong Bai, Yixiao Zhang, Alan L. Yuille, Daguang Xu

3D convolution neural networks (CNN) have been proved very successful in parsing organs or tumours in 3D medical images, but it remains sophisticated and time-consuming to choose or design proper 3D networks given different task contexts.

Image Segmentation Medical Image Segmentation +3
