Search Results for author: Huaming Wang

Found 12 papers, 2 papers with code

Learning to mask: Towards generalized face forgery detection

no code implementations • 29 Dec 2022 • Jianwei Fei, Yunshu Dai, Huaming Wang, Zhihua Xia

Our goal is to reduce the features that are easy to learn in the training phase, so as to reduce the risk of overfitting on specific forgery types.

Data Augmentation

Describing emotions with acoustic property prompts for speech emotion recognition

no code implementations • 14 Nov 2022 • Hira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh

We investigate how the model can learn to associate the audio with the descriptions, resulting in performance improvement of Speech Emotion Recognition and Speech Audio Retrieval.

Retrieval, Speech Emotion Recognition

Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation with E3Net

no code implementations • 4 Nov 2022 • Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Parnamaa, Huaming Wang

We dedicate the early layers to the AEC task while encouraging later layers for personalization by adding a bypass connection from the early layers to the mask prediction layer.

Acoustic echo cancellation, Multi-Task Learning, +1

Audio Retrieval with WavText5K and CLAP Training

1 code implementation • 28 Sep 2022 • Soham Deshmukh, Benjamin Elizalde, Huaming Wang

In this work, we propose a new collection of web audio-text pairs and a new framework for retrieval.

Audio captioning, Contrastive Learning, +2

Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation

no code implementations • 2 Apr 2022 • Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang

Our results show that E3Net provides better speech and transcription quality with a lower target speaker over-suppression (TSOS) rate than the baseline model.

Automatic Speech Recognition, Knowledge Distillation, +3

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

no code implementations • 20 Oct 2021 • Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang

Experimental results show that the proposed geometry agnostic model outperforms the model trained on a specific microphone array geometry in both speech quality and automatic speech recognition accuracy.

Automatic Speech Recognition, Speech Enhancement, +1

Personalized Speech Enhancement: New Models and Comprehensive Evaluation

no code implementations • 18 Oct 2021 • Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang

Our results show that the proposed models can yield better speech recognition accuracy, speech intelligibility, and perceptual quality than the baseline models, and the multi-task training can alleviate the TSOS issue in addition to improving the speech recognition accuracy.

Speech Enhancement, speech-recognition, +1
