Search Results for author: Jinhua Liang

Found 13 papers, 5 papers with code

From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems

no code implementations · 30 Apr 2025 · Huan Zhang, Jinhua Liang, Huy Phan, Wenwu Wang, Emmanouil Benetos

In this paper, we use music generation as a case study to investigate the gap between automatic evaluation metrics and human preferences.

Music Generation

Hierarchical Symbolic Pop Music Generation with Graph Neural Networks

no code implementations · 12 Sep 2024 · Wen Qing Lim, Jinhua Liang, Huan Zhang

Music is inherently made up of complex structures, and representing them as graphs helps to capture multiple levels of relationships.

Music Generation, Rhythm

Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings

no code implementations · 12 Sep 2024 · Tanisha Hisariya, Huan Zhang, Jinhua Liang

Rapid advancements in artificial intelligence have significantly enhanced generative tasks involving music and images, employing both unimodal and multimodal approaches.

FAD, Image Captioning, +2

From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano

no code implementations · 5 Jul 2024 · Huan Zhang, Jinhua Liang, Simon Dixon

Our study investigates an approach for understanding musical performances through the lens of audio encoding models, focusing on the domain of solo Western classical piano music.

Attribute, Benchmarking

DExter: Learning and Controlling Performance Expression with Diffusion Models

1 code implementation · 21 Jun 2024 · Huan Zhang, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Jinhua Liang, Simon Dixon, Gerhard Widmer

The perceptual-feature-conditioned generation and transferring capabilities of DExter are verified by a proxy model predicting perceptual characteristics of differently steered performances.

Music Performance Rendering

Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

2 code implementations · 27 Mar 2024 · Jinhua Liang, Ines Nolasco, Burooj Ghani, Huy Phan, Emmanouil Benetos, Dan Stowell

A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples.

Data Augmentation, Domain Adaptation, +3

WavCraft: Audio Editing and Generation with Large Language Models

1 code implementation · 14 Mar 2024 · Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities

2 code implementations · 30 Nov 2023 · Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

In this work, we introduce Acoustic Prompt Tuning (APT), a new adapter extending LLMs and VLMs to the audio domain by injecting audio embeddings to the input of LLMs, namely soft prompting.

Audio Classification, Few-Shot Audio Classification, +3

WavJourney: Compositional Audio Creation with Large Language Models

1 code implementation · 26 Jul 2023 · Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang

Subjective evaluations demonstrate the potential of WavJourney in crafting engaging storytelling audio content from text.

Audio Generation

Adapting Language-Audio Models as Few-Shot Audio Learners

no code implementations · 28 May 2023 · Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang

We presented the Treff adapter, a training-efficient adapter for CLAP, to boost zero-shot classification performance by making use of a small set of labelled data.

Audio Classification, Few-Shot Learning, +1

Channel Compression: Rethinking Information Redundancy among Channels in CNN Architecture

no code implementations · 2 Jul 2020 · Jinhua Liang, Tao Zhang, Guoqing Feng

Aiming at channel compression, a novel convolutional construction named compact convolution is proposed to embrace the progress in spatial convolution, channel grouping and pooling operation.

Acoustic Scene Classification, Event Detection, +5
