no code implementations • 17 Mar 2024 • Claudio Pinhanez, Raul Fernandez, Marcelo Grave, Julio Nogima, Ron Hoory
Representations of AI agents in user interfaces and robotics are predominantly White, not only in terms of facial and skin features, but also in the synthetic voices they use.
no code implementations • 20 Sep 2023 • Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory
Experimental results show that LLM2Speech maintains the teacher's quality while reducing the latency to enable natural conversations.
no code implementations • 1 Mar 2022 • Hagai Aronowitz, Itai Gat, Edmilson Morais, Weizhong Zhu, Ron Hoory
Beyond that, a common engine should be capable of supporting distributed training with clients' in-house private data.
no code implementations • 21 Feb 2022 • Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury
The NNSI reduces the need for manual labeling by automatically selecting highly ambiguous samples and labeling them with high accuracy.
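The excerpt above describes selecting highly ambiguous samples for labeling. The paper's exact NNSI criterion is not given here; a minimal sketch of one common way to rank ambiguity, by the entropy of each sample's predicted class distribution, could look like this (all names and the toy posteriors are illustrative, not from the paper):

```python
import numpy as np

def select_ambiguous(probs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k samples whose predicted class
    distributions have the highest entropy (most ambiguous)."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # argsort is ascending; take the last k and reverse so the
    # most ambiguous sample comes first.
    return np.argsort(entropy)[-k:][::-1]

# Toy posteriors for 4 samples over 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],   # confident prediction
    [0.34, 0.33, 0.33],   # nearly uniform: highly ambiguous
    [0.70, 0.20, 0.10],
    [0.50, 0.45, 0.05],   # two competing classes
])
picked = select_ambiguous(probs, k=2)
print(picked)  # most ambiguous samples first
```

Under this criterion the near-uniform sample ranks first, followed by the two-way tie, matching the intuition that ambiguous samples are the ones worth sending for labeling.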
no code implementations • ICASSP 2022 • Edmilson Morais, Ron Hoory, Weizhong Zhu, Itai Gat, Matheus Damasceno, Hagai Aronowitz
Self-supervised pre-trained features have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of speech emotion recognition (SER) still need further investigation.
no code implementations • 2 Feb 2022 • Itai Gat, Hagai Aronowitz, Weizhong Zhu, Edmilson Morais, Ron Hoory
Large speech emotion recognition datasets are hard to obtain, and small datasets may contain biases.
Ranked #1 on Speech Emotion Recognition on IEMOCAP (AUC metric)
1 code implementation • 8 Apr 2021 • Samuel Thomas, Hong-Kwang J. Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory
We present a comprehensive study on building and adapting RNN transducer (RNN-T) models for spoken language understanding (SLU).
Automatic Speech Recognition (ASR) +2
no code implementations • 8 Oct 2020 • Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny
Assuming we have additional text-to-intent data (without speech) available, we investigated two techniques to improve the S2I system: (1) transfer learning, in which acoustic embeddings for intent classification are tied to fine-tuned BERT text embeddings; and (2) data augmentation, in which the text-to-intent data is converted into speech-to-intent data using a multi-speaker text-to-speech system.
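The transfer-learning technique in (1) ties acoustic embeddings to fine-tuned BERT text embeddings. A minimal sketch of that tying objective, an MSE loss pulling each utterance's acoustic embedding toward its frozen text embedding, is below; the random embeddings, dimensions, and single hand-computed gradient step are placeholders for the paper's actual encoders and optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: `text_emb` plays the role of frozen fine-tuned
# BERT embeddings, `acoustic_emb` the speech encoder's utterance embeddings.
text_emb = rng.normal(size=(8, 16))       # 8 utterances, dim 16
acoustic_emb = rng.normal(size=(8, 16))

def tying_loss(acoustic: np.ndarray, text: np.ndarray) -> float:
    """MSE that ties each acoustic embedding to its text counterpart."""
    return float(np.mean((acoustic - text) ** 2))

before = tying_loss(acoustic_emb, text_emb)
# Gradient of the mean-squared error w.r.t. the acoustic embeddings.
grad = 2.0 * (acoustic_emb - text_emb) / acoustic_emb.size
acoustic_emb = acoustic_emb - 10.0 * grad   # one toy gradient step
after = tying_loss(acoustic_emb, text_emb)
print(before, after)
```

After the step the loss shrinks, i.e. the acoustic embeddings have moved toward the shared text-embedding space, which is the effect the intent classifier then exploits.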
no code implementations • 30 Sep 2020 • Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras
For our speech-to-entities experiments on the ATIS corpus, both the CTC and attention models showed impressive ability to skip non-entity words: there was little degradation when trained on just entities versus full transcripts.
no code implementations • 28 Jul 2020 • Shai Rozenberg, Hagai Aronowitz, Ron Hoory
With the rise of voice-activated applications, the need for speaker recognition is rapidly increasing.
no code implementations • 2 May 2019 • Zvi Kons, Slava Shechtman, Alex Sorin, Carmel Rabinovitz, Ron Hoory
We first demonstrate the ability of the system to produce high quality speech when trained on large, high quality datasets.
Audio and Speech Processing • Sound