Search Results for author: Xiaohuan Zhou

Found 10 papers, 7 papers with code

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

no code implementations • 12 Feb 2024 • Qian Yang, Jin Xu, Wenrui Liu, Yunfei Chu, Ziyue Jiang, Xiaohuan Zhou, Yichong Leng, YuanJun Lv, Zhou Zhao, Chang Zhou, Jingren Zhou

By revealing the limitations of existing LALMs through evaluation results, AIR-Bench can provide insights into the direction of future research.

2k Automatic Speech Recognition +4

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

2 code implementations • 18 May 2023 • Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

 Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)

Action Classification, AudioCaps, +16

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation • 8 Dec 2022 • Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition

1 code implementation • 29 Nov 2022 • Xiaohuan Zhou, JiaMing Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou

Therefore, we propose to introduce the phoneme modality into pre-training, which can help capture modality-invariant information between Mandarin speech and text.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Contextual Expressive Text-to-Speech

no code implementations • 26 Nov 2022 • Jianhong Tu, Zeyu Cui, Xiaohuan Zhou, Siqi Zheng, Kai Hu, Ju Fan, Chang Zhou

To achieve this task, we construct a synthetic dataset and develop an effective framework.

Speech Synthesis

Speech2Slot: An End-to-End Knowledge-based Slot Filling from Speech

no code implementations • 10 May 2021 • Pengwei Wang, Xin Ye, Xiaohuan Zhou, Jinghui Xie, Hao Wang

In contrast to conventional pipeline Spoken Language Understanding (SLU), which consists of automatic speech recognition (ASR) and natural language understanding (NLU), end-to-end SLU infers the semantic meaning directly from speech and overcomes the error propagation caused by ASR.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +8

xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems

19 code implementations • 14 Mar 2018 • Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, Guangzhong Sun

On one hand, the xDeepFM is able to learn certain bounded-degree feature interactions explicitly; on the other hand, it can learn arbitrary low- and high-order feature interactions implicitly.

Click-Through Rate Prediction, Recommendation Systems
