no code implementations • NAACL (maiworkshop) 2021 • Gal-Lev Shalev, Gabi Shalev, Joseph Keshet
In natural language generation tasks, a neural language model is used for generating a sequence of words forming a sentence.
2 code implementations • 24 Sep 2024 • Yael Segal-Feldman, Aviv Shamsian, Aviv Navon, Gill Hetz, Joseph Keshet
Large transformer-based models have significant potential for speech transcription and translation.
2 code implementations • 12 Sep 2024 • Gil Ayache, Menachem Pirchi, Aviv Navon, Aviv Shamsian, Gill Hetz, Joseph Keshet
In this paper, we introduce WhisperNER, a novel model that allows joint speech transcription and entity recognition.
Automatic Speech Recognition (ASR) +5
no code implementations • 10 Jul 2024 • Arnon Turetzky, Or Tal, Yael Segal-Feldman, Yehoshua Dissen, Ella Zeldes, Amit Roth, Eyal Cohen, Yosi Shrem, Bronya R. Chernyak, Olga Seleznova, Joseph Keshet, Yossi Adi
We present HebDB, a weakly supervised dataset for spoken language processing in the Hebrew language.
Automatic Speech Recognition (ASR) +1
no code implementations • 27 Jun 2024 • Rotem Rousso, Eyal Cohen, Joseph Keshet, Eleanor Chodroff
Forced alignment (FA) plays a key role in speech research through the automatic time alignment of speech signals with corresponding text transcriptions.
Automatic Speech Recognition (ASR) +2
1 code implementation • 27 Jun 2024 • Yehoshua Dissen, Shiry Yonash, Israel Cohen, Joseph Keshet
In the realm of automatic speech recognition (ASR), robustness in noisy environments remains a significant challenge.
Automatic Speech Recognition (ASR) +1
no code implementations • 4 Jun 2024 • Aviv Shamsian, Aviv Navon, Neta Glazer, Gill Hetz, Joseph Keshet
Automatic Speech Recognition (ASR) technology has made significant progress in recent years, providing accurate transcription across various domains.
Automatic Speech Recognition (ASR) +3
no code implementations • 30 Oct 2023 • Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayach, Gidon Krendel, Aviv Shamsian, Aviv Navon, Gil Hetz, Joseph Keshet
In this work, we introduce a novel approach that integrates a domain-specific or secondary LM into a general-purpose LM.
1 code implementation • 2 Oct 2023 • Roi Benita, Michael Elad, Joseph Keshet
Diffusion models have recently been shown to be relevant for high-quality speech generation.
no code implementations • 13 Sep 2023 • Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet
Open vocabulary keyword spotting is a crucial and challenging task in automatic speech recognition (ASR) that focuses on detecting user-defined keywords within a spoken utterance.
Automatic Speech Recognition (ASR) +2
no code implementations • 12 Jul 2022 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet
Image captioning research achieved breakthroughs in recent years by developing neural models that can generate diverse and high-quality descriptions for images drawn from the same distribution as training images.
1 code implementation • 29 Jun 2022 • Yael Segal, Kasia Hitczenko, Matthew Goldrick, Adam Buchwald, Angela Roberts, Joseph Keshet
These segmentations predicted by the models are used to obtain measures of speech rate and sound duration.
no code implementations • 10 May 2022 • Tzeviya Sylvia Fuchs, Joseph Keshet
In this work, we present a regression-based ordinal regression algorithm for supervised classification of instances into ordinal categories.
no code implementations • 8 Apr 2022 • Yehoshua Dissen, Felix Kreuk, Joseph Keshet
Specifically, the study focuses on generating high-quality neural speaker representations without any annotated data, as well as on estimating secondary hyperparameters of the model without annotations.
no code implementations • 7 Apr 2022 • Talia Ben-Simon, Felix Kreuk, Faten Awwad, Jacob T. Cohen, Joseph Keshet
Adult language learners adjust their speech to match the tutor's reference.
1 code implementation • 31 Mar 2022 • Bronya R. Chernyak, Talia Ben Simon, Yael Segal, Jeremy Steffman, Eleanor Chodroff, Jennifer S. Cole, Joseph Keshet
The classifier is implemented as a multi-headed fully-connected network trained to detect creaky voice, voicing, and pitch, where the last two are used to refine creak prediction.
1 code implementation • 15 Mar 2021 • Bronya Roni Chernyak, Bhiksha Raj, Tamir Hazan, Joseph Keshet
This paper proposes an attack-independent (non-adversarial training) technique for improving adversarial robustness of neural network models, with minimal loss of standard accuracy.
no code implementations • 7 Mar 2021 • Tzeviya Sylvia Fuchs, Yael Segal, Joseph Keshet
In this paper, we propose a spoken term detection algorithm for simultaneous prediction and localization of in-vocabulary and out-of-vocabulary terms within an audio segment.
no code implementations • 16 Nov 2020 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet
We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these vectors.
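The idea of fixed, randomly drawn class vectors can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian sampling, the unit-length normalization, and the dot-product prediction rule are all assumptions made for illustration only.

```python
import math
import random
from typing import List


def draw_fixed_class_vectors(num_classes: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Draw one random unit-length vector per class (hypothetical helper).

    The vectors are sampled once and are meant to stay fixed afterwards,
    so they carry no learned visual similarity between classes.
    """
    rng = random.Random(seed)
    vectors = []
    for _ in range(num_classes):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v])
    return vectors


def predict(embedding: List[float], class_vectors: List[List[float]]) -> int:
    """Return the index of the class vector with the largest dot product (assumed rule)."""
    scores = [sum(e * c for e, c in zip(embedding, vec)) for vec in class_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


class_vectors = draw_fixed_class_vectors(num_classes=5, dim=16, seed=1)
print(predict(class_vectors[3], class_vectors))
```

In a training loop under these assumptions, only the network producing `embedding` would be updated, while `class_vectors` would be excluded from the optimizer.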
no code implementations • 3 Sep 2020 • Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, Joseph Keshet
We present a framework for certifying the degree of fairness of a model based on an interactive and privacy-preserving test.
2 code implementations • 27 Jul 2020 • Felix Kreuk, Joseph Keshet, Yossi Adi
Results suggest that our approach surpasses the baseline models and reaches state-of-the-art performance on both data sets.
1 code implementation • 11 Feb 2020 • Felix Kreuk, Yaniv Sheena, Joseph Keshet, Yossi Adi
Phoneme boundary detection is an essential first step for a variety of speech processing applications such as speaker diarization, speech science, keyword spotting, etc.
1 code implementation • 27 Oct 2019 • Yosi Shrem, Matthew Goldrick, Joseph Keshet
Voice Onset Time (VOT), a key measurement of speech for basic research and applied medical studies, is the time between the onset of a stop burst and the onset of voicing.
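The definition above amounts to a simple subtraction, which the following minimal sketch makes concrete. It assumes the two onset times are already known (for example from manual annotation) and does not reproduce the paper's automatic measurement method; the function name and the choice of milliseconds are illustrative assumptions.

```python
def voice_onset_time_ms(burst_onset_s: float, voicing_onset_s: float) -> float:
    """Return VOT in milliseconds: onset of voicing minus onset of the stop burst.

    Both arguments are times in seconds measured from the start of the recording.
    A negative value means voicing began before the burst (prevoicing).
    """
    return (voicing_onset_s - burst_onset_s) * 1000.0


# Burst at 100 ms, voicing at 165 ms: a positive VOT of about 65 ms.
print(voice_onset_time_ms(0.100, 0.165))

# Voicing at 120 ms, burst at 200 ms: a negative VOT of about -80 ms.
print(voice_onset_time_ms(0.200, 0.120))
```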
no code implementations • 14 Apr 2019 • Yael Segal, Tzeviya Sylvia Fuchs, Joseph Keshet
In this paper, we propose to apply object detection methods from the vision domain to the speech recognition domain, by treating audio fragments as objects.
1 code implementation • 7 Feb 2019 • Felix Kreuk, Yossi Adi, Bhiksha Raj, Rita Singh, Joseph Keshet
Steganography is the science of hiding a secret message within an ordinary public message, which is referred to as the carrier.
no code implementations • NeurIPS 2018 • Gabi Shalev, Yossi Adi, Joseph Keshet
Deep Neural Networks are powerful models that attained remarkable results on a variety of tasks.
no code implementations • 13 Feb 2018 • Felix Kreuk, Assi Barak, Shir Aviv-Reuven, Moran Baruch, Benny Pinkas, Joseph Keshet
Deep learning models have been successfully applied to malware detection.
2 code implementations • 13 Feb 2018 • Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet
Unfortunately, once the models are sold they can be easily copied and redistributed.
no code implementations • 10 Jan 2018 • Felix Kreuk, Yossi Adi, Moustapha Cisse, Joseph Keshet
We also present two black-box attacks: one in which the adversarial examples were generated with a system trained on YOHO but the attack targets a system trained on NTIMIT, and one in which the adversarial examples were generated with a system trained on the Mel-spectrum feature set but the attack targets a system trained on MFCC.
no code implementations • NeurIPS 2017 • Moustapha M. Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet
Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines.
no code implementations • 17 Jul 2017 • Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet
Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines.
no code implementations • 5 Apr 2017 • Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet
Pre-aspiration is defined as the period of glottal friction occurring in sequences of vocalic/consonantal sonorants and phonetically voiceless obstruents.
no code implementations • 28 Mar 2017 • Einat Naaman, Yossi Adi, Joseph Keshet
This task generalizes problems such as lexical access (the problem of learning the mapping between words and their possible pronunciations), and defining word neighborhoods.
Automatic Speech Recognition (ASR) +2
no code implementations • 6 Nov 2016 • Yehoshua Dissen, Joseph Keshet, Jacob Goldberger, Cynthia Clopper
We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speech domains with very different characteristics.
1 code implementation • 26 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Erin Gustafson, Cynthia Clopper, Matthew Goldrick
Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel.
no code implementations • 25 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Matthew Goldrick
We describe and analyze a simple and effective algorithm for sequence segmentation applied to speech processing tasks.
no code implementations • 24 Dec 2015 • Joseph Keshet, Adam Kariv, Arnon Dagan, Dvir Volk, Joey Simhon
There are around a hundred installed apps on an average smartphone.
no code implementations • 7 Dec 2015 • Danny Karmon, Joseph Keshet
Methods that are aimed at risk minimization, such as the structured ramp loss, the structured probit loss, and direct loss minimization, require at least two inference operations per training iteration.
no code implementations • NeurIPS 2013 • Tamir Hazan, Subhransu Maji, Joseph Keshet, Tommi Jaakkola
In this work we develop efficient methods for learning random MAP predictors for structured label problems.
no code implementations • NeurIPS 2011 • Joseph Keshet, David A. Mcallester
We consider latent structural versions of probit loss and ramp loss.
no code implementations • NeurIPS 2010 • Tamir Hazan, Joseph Keshet, David A. Mcallester
In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss.
no code implementations • NeurIPS 2008 • Yves Grandvalet, Alain Rakotomamonjy, Joseph Keshet, Stéphane Canu
We consider the problem of binary classification where the classifier may abstain instead of classifying each observation.
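To make the abstention setting concrete, here is a minimal sketch of a generic reject rule that thresholds an estimated posterior probability. It is not the method proposed in the paper; the function name, the `margin` parameter, and the use of `None` to signal abstention are assumptions for illustration.

```python
from typing import Optional


def predict_or_abstain(p_positive: float, margin: float = 0.2) -> Optional[int]:
    """Return +1 or -1 when the estimate is confident enough, otherwise None.

    p_positive is an estimated probability of the positive class.
    The classifier abstains when p_positive falls strictly inside
    the band (0.5 - margin, 0.5 + margin).
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must lie in [0, 1]")
    if p_positive >= 0.5 + margin:
        return 1
    if p_positive <= 0.5 - margin:
        return -1
    return None  # abstain


for p in (0.9, 0.55, 0.1):
    print(p, predict_or_abstain(p))
```

A wider `margin` trades coverage for accuracy: more observations are left unclassified, and the ones that are classified lie further from the decision boundary.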