Search Results for author: Qiushi Huang

Found 13 papers, 11 papers with code

Sequential Prediction of Social Media Popularity with Deep Temporal Context Networks

1 code implementation • 12 Dec 2017 • Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Qiushi Huang, Jintao Li, Tao Mei

With a joint embedding network, we obtain a unified deep representation of multi-modal user-post data in a common embedding space.

Social Media Popularity Prediction
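
A minimal sketch of the joint-embedding idea described above, assuming a PyTorch setup with placeholder feature dimensions (this is illustrative, not the authors' code): two modalities of user-post data are projected into one shared space and fused into a unified representation.

# Hypothetical sketch: project two modalities into a common embedding space and fuse them.
import torch
import torch.nn as nn

class JointEmbedding(nn.Module):
    def __init__(self, visual_dim=2048, meta_dim=64, embed_dim=256):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, embed_dim)  # post/image features
        self.meta_proj = nn.Linear(meta_dim, embed_dim)      # user/temporal metadata

    def forward(self, visual_feat, meta_feat):
        # Map both modalities into the shared embedding space, then fuse.
        v = torch.tanh(self.visual_proj(visual_feat))
        m = torch.tanh(self.meta_proj(meta_feat))
        return v + m  # unified multi-modal representation

fused = JointEmbedding()(torch.randn(8, 2048), torch.randn(8, 64))
print(fused.shape)  # torch.Size([8, 256])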

Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning

1 code implementation • 21 Jul 2021 • Xubo Liu, Turab Iqbal, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

We evaluate our approach on the UrbanSound8K dataset against SampleRNN, using performance metrics that measure the quality and diversity of the generated sounds.

Music Generation · Representation Learning +1

CL4AC: A Contrastive Loss for Audio Captioning

2 code implementations • 21 Jul 2021 • Xubo Liu, Qiushi Huang, Xinhao Mei, Tom Ko, H Lilian Tang, Mark D. Plumbley, Wenwu Wang

Automated audio captioning (AAC) is a cross-modal translation task that aims to use natural language to describe the content of an audio clip.

Audio Captioning · Translation
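
A generic InfoNCE-style audio-text contrastive objective, sketched here only to illustrate what a contrastive loss over paired audio and caption embeddings looks like; the exact CL4AC formulation may differ.

# Hypothetical sketch: symmetric contrastive loss over matched audio-caption pairs.
import torch
import torch.nn.functional as F

def contrastive_loss(audio_emb, text_emb, temperature=0.07):
    # audio_emb, text_emb: (B, D) embeddings of matched audio-caption pairs
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature                  # pairwise similarities
    targets = torch.arange(a.size(0))               # the i-th audio matches the i-th caption
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

loss = contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())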

Audio Captioning Transformer

1 code implementation • 21 Jul 2021 • Xinhao Mei, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

In this paper, we propose an Audio Captioning Transformer (ACT), which is a full Transformer network based on an encoder-decoder architecture and is totally convolution-free.

AudioCaps · Audio Captioning
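
A minimal, hypothetical sketch of a convolution-free encoder-decoder captioner in PyTorch, mirroring the architecture described above (dimensions and the TinyAudioCaptioner class are placeholders, the causal decoder mask is omitted for brevity, and this is not the released ACT code).

# Hypothetical sketch: attend over spectrogram patch embeddings, decode caption tokens.
import torch
import torch.nn as nn

class TinyAudioCaptioner(nn.Module):
    def __init__(self, patch_dim=256, d_model=256, vocab_size=5000):
        super().__init__()
        self.patch_embed = nn.Linear(patch_dim, d_model)   # linear patch projection, no convolutions
        self.token_embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(d_model=d_model, nhead=4,
                                          num_encoder_layers=2, num_decoder_layers=2,
                                          batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, patches, caption_tokens):
        src = self.patch_embed(patches)            # (B, num_patches, d_model)
        tgt = self.token_embed(caption_tokens)     # (B, seq_len, d_model)
        out = self.transformer(src, tgt)           # causal mask omitted for brevity
        return self.lm_head(out)                   # next-token logits

logits = TinyAudioCaptioner()(torch.randn(2, 50, 256), torch.randint(0, 5000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 5000])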

Separate What You Describe: Language-Queried Audio Source Separation

1 code implementation • 28 Mar 2022 • Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

In this paper, we introduce the task of language-queried audio source separation (LASS), which aims to separate a target source from an audio mixture based on a natural language query of the target source (e.g., "a man tells a joke followed by people laughing").

AudioCaps · Audio Source Separation
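
A hypothetical sketch of the language-queried setup, assuming a precomputed text-query embedding (e.g. from a pretrained text encoder) and placeholder dimensions: a spectrogram mask estimator is conditioned on the query. This illustrates the idea only and is not the LASS implementation.

# Hypothetical sketch: estimate a target-source mask conditioned on a text-query embedding.
import torch
import torch.nn as nn

class TextQueriedSeparator(nn.Module):
    def __init__(self, n_freq=257, text_dim=768, hidden=512):
        super().__init__()
        self.query_proj = nn.Linear(text_dim, hidden)             # embed the language query
        self.mask_net = nn.Sequential(
            nn.Linear(n_freq + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_freq), nn.Sigmoid())              # per-frame magnitude mask

    def forward(self, mixture_spec, query_emb):
        # mixture_spec: (B, T, n_freq); query_emb: (B, text_dim) from a text encoder
        q = self.query_proj(query_emb).unsqueeze(1).expand(-1, mixture_spec.size(1), -1)
        mask = self.mask_net(torch.cat([mixture_spec, q], dim=-1))
        return mixture_spec * mask                                # estimated target-source spectrogram

est = TextQueriedSeparator()(torch.rand(2, 100, 257), torch.randn(2, 768))
print(est.shape)  # torch.Size([2, 100, 257])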

Personalized Dialogue Generation with Persona-Adaptive Attention

1 code implementation • 27 Oct 2022 • Qiushi Huang, Yu Zhang, Tom Ko, Xubo Liu, Bo Wu, Wenwu Wang, Lilian Tang

Persona-based dialogue systems aim to generate consistent responses based on historical context and predefined persona.

Dialogue Generation
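
A hypothetical sketch of the persona-adaptive idea, assuming PyTorch and placeholder dimensions: a learned gate mixes persona-conditioned and context-conditioned attention outputs. It illustrates the concept only and is not the paper's implementation.

# Hypothetical sketch: gate between attention over dialogue context and attention over persona.
import torch
import torch.nn as nn

class PersonaAdaptiveAttention(nn.Module):
    def __init__(self, d_model=256, nhead=4):
        super().__init__()
        self.context_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.persona_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.gate = nn.Linear(2 * d_model, 1)   # decides how much persona vs. context to use

    def forward(self, decoder_states, context, persona):
        c, _ = self.context_attn(decoder_states, context, context)
        p, _ = self.persona_attn(decoder_states, persona, persona)
        w = torch.sigmoid(self.gate(torch.cat([c, p], dim=-1)))  # per-position weight in [0, 1]
        return w * p + (1 - w) * c

out = PersonaAdaptiveAttention()(torch.randn(2, 10, 256), torch.randn(2, 30, 256), torch.randn(2, 5, 256))
print(out.shape)  # torch.Size([2, 10, 256])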

WavJourney: Compositional Audio Creation with Large Language Models

1 code implementation • 26 Jul 2023 • Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang

Subjective evaluations demonstrate the potential of WavJourney in crafting engaging storytelling audio content from text.

Audio Generation

Retrieval-Augmented Text-to-Audio Generation

no code implementations • 14 Sep 2023 • Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

Despite recent progress in text-to-audio (TTA) generation, we show that state-of-the-art models such as AudioLDM, when trained on datasets with an imbalanced class distribution such as AudioCaps, are biased in their generation performance.

AudioCaps · Audio Generation +2

KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

1 code implementation • 4 Feb 2024 • Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications.

In-Context Learning · Language Modelling +1
