Search Results for author: Wei Xue

Found 39 papers, 14 papers with code

FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation

no code implementations 13 May 2024 Jianyi Chen, Wei Xue, Xu Tan, Zhen Ye, Qifeng Liu, Yike Guo

By intensive experimental studies, we demonstrate that the proposed method can generate better samples than SingSong, and accelerate the generation by at least 30 times.

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

1 code implementation 28 Apr 2024 Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.

In-Context Learning Music Generation

FlashSpeech: Efficient Zero-Shot Speech Synthesis

1 code implementation 23 Apr 2024 Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Qifeng Liu, Yike Guo, Wei Xue

The generation processes of FlashSpeech can be achieved efficiently with one or two sampling steps while maintaining high audio quality and high similarity to the audio prompt for zero-shot speech generation.

Speech Synthesis Voice Conversion

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

no code implementations 31 Mar 2024 Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu

To this end, we propose learning to Refine Query for Retrieval Augmented Generation (RQ-RAG) in this paper, endeavoring to enhance the model by equipping it with capabilities for explicit rewriting, decomposition, and disambiguation.

In-Context Learning Response Generation +1

Ad Recommendation in a Collapsed and Entangled World

no code implementations 22 Feb 2024 Junwei Pan, Wei Xue, Ximei Wang, Haibin Yu, Xun Liu, Shijie Quan, Xueming Qiu, Dapeng Liu, Lei Xiao, Jie Jiang

In this paper, we present an industry ad recommendation system, paying attention to the challenges and practices of learning appropriate representations.

Feature Correlation Model Optimization

RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning

no code implementations 19 Feb 2024 Congyun Jin, Ming Zhang, Xiaowei Ma, Li Yujiao, Yingbo Wang, Yabo Jia, Yuliang Du, Tao Sun, Haowen Wang, Cong Fan, Jinjie Gu, Chenfei Chi, Xiangguo Lv, Fangzhou Li, Wei Xue, Yiran Huang

Recent advancements in Large Language Models (LLMs) and Large Multi-modal Models (LMMs) have shown potential in various medical applications, such as Intelligent Medical Diagnosis.

document understanding Medical Diagnosis +1

CoMoSVC: Consistency Model-based Singing Voice Conversion

no code implementations 3 Jan 2024 Yiwen Lu, Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo

The diffusion-based Singing Voice Conversion (SVC) methods have achieved remarkable performances, producing natural audios with high similarity to the target timbre.

Voice Conversion

FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection

no code implementations 22 Dec 2023 Dongmei Zhang, Chang Li, Ray Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang

In this work, we propose FM-OV3D, a method of Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection, which improves the open-vocabulary localization and recognition abilities of 3D models by blending knowledge from multiple pre-trained foundation models, achieving true open-vocabulary detection without facing constraints from the original 3D datasets.

3D Object Detection 3D Open-Vocabulary Object Detection +2

RJUA-QA: A Comprehensive QA Dataset for Urology

1 code implementation 15 Dec 2023 Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, contributing to bridge the gap between general large language models (LLMs) and medical-specific LLM applications.

Question Answering

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

no code implementations 29 Nov 2023 Xingqun Qi, Jiahao Pan, Peng Li, Ruibin Yuan, Xiaowei Chi, Mengfei Li, Wenhan Luo, Wei Xue, Shanghang Zhang, Qifeng Liu, Yike Guo

In addition, the lack of large-scale available datasets with emotional transition speech and corresponding 3D human gestures also limits the addressing of this task.

Audio inpainting Gesture Generation

Continual Learning with Dirichlet Generative-based Rehearsal

no code implementations 13 Sep 2023 Min Zeng, Wei Xue, Qifeng Liu, Yike Guo

Recent advancements in data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning due to computational constraints and time-consuming issues.

Continual Learning Incremental Learning +7

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate

1 code implementation 14 Aug 2023 Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu

Text evaluation has historically posed significant challenges, often demanding substantial labor and time cost.

Text Generation

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

1 code implementation 29 Jun 2023 Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.

Automatic Lyrics Transcription Language Modelling +3

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

1 code implementation 7 Jun 2023 Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang

Note that our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.

Test-time Adaptation

NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation

no code implementations 22 May 2023 Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo

Since expert knowledge is hard to acquire, it hinders the flexibility to quickly design and tune digital synthesizers for diverse sounds.

Neural Architecture Search

Taxonomy Completion with Probabilistic Scorer via Box Embedding

1 code implementation 18 May 2023 Wei Xue, Yongliang Shen, Wenqi Ren, Jietian Guo, ShiLiang Pu, Weiming Lu

Specifically, TaxBox consists of three components: (1) a graph aggregation module to leverage the structural information of the taxonomy and two lightweight decoders that map features to box embedding and capture complex relationships between concepts; (2) two probabilistic scorers that correspond to attachment and insertion operations and ensure the avoidance of pseudo-leaves; and (3) three learning objectives that assist the model in mapping concepts more granularly onto the box embedding space.

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

1 code implementation 11 May 2023 Zhen Ye, Wei Xue, Xu Tan, Jie Chen, Qifeng Liu, Yike Guo

In this paper, we propose a "Co"nsistency "Mo"del-based "Speech" synthesis method, CoMoSpeech, which achieves speech synthesis through a single diffusion sampling step while maintaining high audio quality.

Denoising Singing Voice Synthesis +1

RECIST Weakly Supervised Lesion Segmentation via Label-Space Co-Training

no code implementations 1 Mar 2023 Lianyu Zhou, Dong Wei, Donghuan Lu, Wei Xue, Liansheng Wang, Yefeng Zheng

As an essential indicator for cancer progression and treatment response, tumor size is often measured following the response evaluation criteria in solid tumors (RECIST) guideline in CT slices.

Lesion Segmentation Weakly supervised segmentation

Pathway to Future Symbiotic Creativity

no code implementations 18 Aug 2022 Yike Guo, Qifeng Liu, Jie Chen, Wei Xue, Jie Fu, Henrik Jensen, Fernando Rosas, Jeffrey Shaw, Xing Wu, Jiji Zhang, Jianliang Xu

This report presents a comprehensive view of our vision for the development path of human-machine symbiotic art creation.


Efficient Climate Simulation via Machine Learning Method

no code implementations 15 Aug 2022 Xin Wang, Wei Xue, Yilun Han, Guangwen Yang

We develop a user-friendly platform NeuroGCM for efficiently developing hybrid modeling in climate simulation.

Transferable Physical Attack against Object Detection with Separable Attention

no code implementations 19 May 2022 Yu Zhang, Zhiqiang Gong, Yichuang Zhang, YongQian Li, Kangcheng Bin, Jiahao Qi, Wei Xue, Ping Zhong

Transferable adversarial attack is always in the spotlight since deep learning models have been demonstrated to be vulnerable to adversarial samples.

Adversarial Attack object-detection +1

Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

no code implementations 26 Sep 2020 Zixuan Xiao, Wei Xue, Ping Zhong

Particularly, in order to fit the object detection settings, our proposed few-shot detector concentrates on the relations that lie in the level of objects instead of the full image with the assistance of Self-Adaptive Attention Network (SAAN).

Few-Shot Object Detection Object +1

Neural Kalman Filtering for Speech Enhancement

no code implementations 28 Jul 2020 Wei Xue, Gang Quan, Chao Zhang, Guohong Ding, Xiaodong He, BoWen Zhou

Statistical signal processing based speech enhancement methods adopt expert knowledge to design the statistical models and linear filters, which makes them complementary to the data-driven deep neural network (DNN) based methods.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Re-examining the Solar Axion Explanation for the XENON1T Excess

no code implementations 25 Jun 2020 Christina Gao, Jia Liu, Lian-Tao Wang, Xiao-Ping Wang, Wei Xue, Yi-Ming Zhong

Meanwhile, they can also scatter with the atoms through the inverse Primakoff process via the axion-photon coupling, which emits a photon and mimics the electronic recoil signals.

High Energy Physics - Phenomenology High Energy Physics - Experiment

Aspect Based Sentiment Analysis with Gated Convolutional Networks

1 code implementation ACL 2018 Wei Xue, Tao Li

Aspect based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis, because it aims to predict the sentiment polarities of the given aspects or entities in text.

Aspect-Based Sentiment Analysis Aspect Category Sentiment Analysis

Serendipity in dark photon searches

1 code implementation 15 Jan 2018 Philip Ilten, Yotam Soreq, Mike Williams, Wei Xue

Searches for dark photons provide serendipitous discovery potential for other types of vector particles.

High Energy Physics - Phenomenology High Energy Physics - Experiment

MTNA: A Neural Multi-task Model for Aspect Category Classification and Aspect Term Extraction On Restaurant Reviews

no code implementations IJCNLP 2017 Wei Xue, Wubai Zhou, Tao Li, Qing Wang

Online reviews are valuable resources not only for consumers to make decisions before purchase, but also for providers to get feedback on their services or commodities.

Aspect-Based Sentiment Analysis Extract Aspect +5

PBODL: Parallel Bayesian Online Deep Learning for Click-Through Rate Prediction in Tencent Advertising System

no code implementations 4 Jul 2017 Xun Liu, Wei Xue, Lei Xiao, Bo Zhang

Then we extend the model family to a variety of Bayesian online models with increasing feature embedding capabilities, such as Sparse-MLP, FM-MLP and FFM-MLP.

Click-Through Rate Prediction
