Search Results for author: Ying Shi

Found 12 papers, 0 papers with code

A Glance is Enough: Extract Target Sentence By Looking at A keyword

no code implementations • 9 Oct 2023 • Ying Shi, Dong Wang, Lantian Li, Jiqing Han

This paper investigates the possibility of extracting a target sentence from multi-talker speech using only a keyword as input.

Sentence

Spot keywords from very noisy and mixed speech

no code implementations • 28 May 2023 • Ying Shi, Dong Wang, Lantian Li, Jiqing Han, Shi Yin

We propose a novel Mix Training (MT) strategy that encourages the model to discover low-energy keywords from noisy and mixed speech.

Data Augmentation, Keyword Spotting

Can We Trust Deep Speech Prior?

no code implementations • 4 Nov 2020 • Ying Shi, Haolin Chen, Zhiyuan Tang, Lantian Li, Dong Wang, Jiqing Han

Recently, speech enhancement (SE) based on deep speech prior has attracted much attention, such as the variational auto-encoder with non-negative matrix factorization (VAE-NMF) architecture.

Speech Enhancement

Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning

no code implementations • 18 Jul 2019 • Ying Shi, Wei Wei, Zhiming Zheng

Zero-shot learning (ZSL) aims to recognize the novel object categories using the semantic representation of categories, and the key idea is to explore the knowledge of how the novel class is semantically related to the familiar classes.

Generalized Zero-Shot Learning, Object Recognition

Gaussian-Constrained training for speaker verification

no code implementations • 8 Nov 2018 • Lantian Li, Zhiyuan Tang, Ying Shi, Dong Wang

This paper proposes a Gaussian-constrained training approach that (1) discards the parametric classifier, and (2) enforces the distribution of the derived speaker vectors to be Gaussian.

Speaker Verification

Phonetic-attention scoring for deep speaker features in speaker verification

no code implementations • 8 Nov 2018 • Lantian Li, Zhiyuan Tang, Ying Shi, Dong Wang

This score reflects the similarity of the two frames in phonetic content, and is used to weigh the contribution of this frame pair in the utterance-based scoring.

Machine Translation, Speaker Verification

Deep factorization for speech signal

no code implementations • 27 Feb 2018 • Lantian Li, Dong Wang, Yixiang Chen, Ying Shi, Zhiyuan Tang, Thomas Fang Zheng

Various informative factors are mixed in speech signals, leading to great difficulty when decoding any of the factors.

Emotion Recognition, Speaker Recognition

Deep Factorization for Speech Signal

no code implementations • 5 Jun 2017 • Dong Wang, Lantian Li, Ying Shi, Yixiang Chen, Zhiyuan Tang

In this paper, we demonstrated that the speaker factor is also a short-time spectral pattern and can be largely identified with just a few frames using a simple deep neural network (DNN).

Emotion Recognition

Phone-aware Neural Language Identification

no code implementations • 9 May 2017 • Zhiyuan Tang, Dong Wang, Yixiang Chen, Ying Shi, Lantian Li

Pure acoustic neural models, particularly the LSTM-RNN model, have shown great potential in language identification (LID).

Language Identification

Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition

no code implementations • 28 Sep 2016 • Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, Shiyue Zhang

Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU).

Automatic Speech Recognition (ASR)

Pursing power in Arabic on-line discussion forums

no code implementations • LREC 2012 • Marc Tomlinson, David Bracewell, Mary Draper, Zewar Almissour, Ying Shi, Jeremy Bensley

An analysis of our annotations reflects a high degree of overlap between current theories on power and conflict within a group and the behavior of individuals within the transcripts.
