Search Results for author: Joseph Keshet

Found 37 papers, 10 papers with code

DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation

1 code implementation • 2 Oct 2023 • Roi Benita, Michael Elad, Joseph Keshet

Diffusion models have recently been shown to be relevant for high-quality speech generation.

Denoising valid

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

no code implementations • 13 Sep 2023 • Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet

Open vocabulary keyword spotting is a crucial and challenging task in automatic speech recognition (ASR) that focuses on detecting user-defined keywords within a spoken utterance.

Automatic Speech Recognition (ASR) +2

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

no code implementations • 12 Jul 2022 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet

Image captioning research achieved breakthroughs in recent years by developing neural models that can generate diverse and high-quality descriptions for images drawn from the same distribution as training images.

Image Captioning Out of Distribution (OOD) Detection

DDKtor: Automatic Diadochokinetic Speech Analysis

1 code implementation • 29 Jun 2022 • Yael Segal, Kasia Hitczenko, Matthew Goldrick, Adam Buchwald, Angela Roberts, Joseph Keshet

These segmentations predicted by the models are used to obtain measures of speech rate and sound duration.

THOR: Threshold-Based Ranking Loss for Ordinal Regression

no code implementations • 10 May 2022 • Tzeviya Sylvia Fuchs, Joseph Keshet

In this work, we present a regression-based ordinal regression algorithm for supervised classification of instances into ordinal categories.

regression

Self-supervised Speaker Diarization

no code implementations • 8 Apr 2022 • Yehoshua Dissen, Felix Kreuk, Joseph Keshet

Specifically, the study focuses on generating high-quality neural speaker representations without any annotated data, as well as on estimating secondary hyperparameters of the model without annotations.

Speaker Diarization +1

DeepFry: Identifying Vocal Fry Using Deep Neural Networks

1 code implementation • 31 Mar 2022 • Bronya R. Chernyak, Talia Ben Simon, Yael Segal, Jeremy Steffman, Eleanor Chodroff, Jennifer S. Cole, Joseph Keshet

The classifier is implemented as a multi-headed fully-connected network trained to detect creaky voice, voicing, and pitch, where the last two are used to refine creak prediction.

Constant Random Perturbations Provide Adversarial Robustness with Minimal Effect on Accuracy

1 code implementation • 15 Mar 2021 • Bronya Roni Chernyak, Bhiksha Raj, Tamir Hazan, Joseph Keshet

This paper proposes an attack-independent (non-adversarial training) technique for improving adversarial robustness of neural network models, with minimal loss of standard accuracy.

Adversarial Robustness

CNN-based Spoken Term Detection and Localization without Dynamic Programming

no code implementations • 7 Mar 2021 • Tzeviya Sylvia Fuchs, Yael Segal, Joseph Keshet

In this paper, we propose a spoken term detection algorithm for simultaneous prediction and localization of in-vocabulary and out-of-vocabulary terms within an audio segment.

Word Embeddings

Redesigning the classification layer by randomizing the class representation vectors

no code implementations • 16 Nov 2020 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet

We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these vectors.

Classification General Classification +1

Fairness in the Eyes of the Data: Certifying Machine-Learning Models

no code implementations • 3 Sep 2020 • Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, Joseph Keshet

We present a framework for certifying the degree of fairness of a model based on an interactive and privacy-preserving test.

BIG-bench Machine Learning Fairness +1

Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation

2 code implementations • 27 Jul 2020 • Felix Kreuk, Joseph Keshet, Yossi Adi

Results suggest that our approach surpasses the baseline models and reaches state-of-the-art performance on both data sets.

Boundary Detection Contrastive Learning +1

Phoneme Boundary Detection using Learnable Segmental Features

1 code implementation • 11 Feb 2020 • Felix Kreuk, Yaniv Sheena, Joseph Keshet, Yossi Adi

Phoneme boundary detection is an essential first step for a variety of speech processing applications, such as speaker diarization, speech science, and keyword spotting.

Boundary Detection Keyword Spotting +2

Dr.VOT : Measuring Positive and Negative Voice Onset Time in the Wild

1 code implementation • 27 Oct 2019 • Yosi Shrem, Matthew Goldrick, Joseph Keshet

Voice Onset Time (VOT), a key measurement of speech for basic research and applied medical studies, is the time between the onset of a stop burst and the onset of voicing.

Multi-Task Learning Structured Prediction

SpeechYOLO: Detection and Localization of Speech Objects

no code implementations • 14 Apr 2019 • Yael Segal, Tzeviya Sylvia Fuchs, Joseph Keshet

In this paper, we propose applying object detection methods from the vision domain to the speech recognition domain by treating audio fragments as objects.

General Classification Keyword Spotting +5

Hide and Speak: Towards Deep Neural Networks for Speech Steganography

1 code implementation • 7 Feb 2019 • Felix Kreuk, Yossi Adi, Bhiksha Raj, Rita Singh, Joseph Keshet

Steganography is the science of hiding a secret message within an ordinary public message, which is referred to as the carrier.

Fooling End-to-end Speaker Verification by Adversarial Examples

no code implementations • 10 Jan 2018 • Felix Kreuk, Yossi Adi, Moustapha Cisse, Joseph Keshet

We also present two black-box attacks: one in which the adversarial examples are generated with a system trained on YOHO but the attack targets a system trained on NTIMIT, and one in which the adversarial examples are generated with a system trained on the Mel-spectrum feature set but the attack targets a system trained on MFCC.

Speaker Verification

Houdini: Fooling Deep Structured Prediction Models

no code implementations • 17 Jul 2017 • Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet

Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines.

General Classification Pose Estimation +4

Automatic Measurement of Pre-aspiration

no code implementations • 5 Apr 2017 • Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet

Pre-aspiration is defined as the period of glottal friction occurring in sequences of vocalic/consonantal sonorants and phonetically voiceless obstruents.

Friction Structured Prediction

Learning Similarity Functions for Pronunciation Variations

no code implementations • 28 Mar 2017 • Einat Naaman, Yossi Adi, Joseph Keshet

This task generalizes problems such as lexical access (the problem of learning the mapping between words and their possible pronunciations), and defining word neighborhoods.

Automatic Speech Recognition (ASR) +2

Domain Adaptation For Formant Estimation Using Deep Learning

no code implementations • 6 Nov 2016 • Yehoshua Dissen, Joseph Keshet, Jacob Goldberger, Cynthia Clopper

We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speech domains with very different characteristics.

Domain Adaptation

Automatic measurement of vowel duration via structured prediction

1 code implementation • 26 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Erin Gustafson, Cynthia Clopper, Matthew Goldrick

Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel.

Structured Prediction

Sequence Segmentation Using Joint RNN and Structured Prediction Models

no code implementations • 25 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Matthew Goldrick

We describe and analyze a simple and effective algorithm for sequence segmentation applied to speech processing tasks.

Segmentation Structured Prediction

Context-Based Prediction of App Usage

no code implementations • 24 Dec 2015 • Joseph Keshet, Adam Kariv, Arnon Dagan, Dvir Volk, Joey Simhon

There are around a hundred installed apps on an average smartphone.

Navigate

Risk Minimization in Structured Prediction using Orbit Loss

no code implementations • 7 Dec 2015 • Danny Karmon, Joseph Keshet

Methods that are aimed at risk minimization, such as the structured ramp loss, the structured probit loss, and direct loss minimization, require at least two inference operations per training iteration.

Structured Prediction

Direct Loss Minimization for Structured Prediction

no code implementations • NeurIPS 2010 • Tamir Hazan, Joseph Keshet, David A. McAllester

In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss.

Binary Classification Machine Translation +2

Support Vector Machines with a Reject Option

no code implementations • NeurIPS 2008 • Yves Grandvalet, Alain Rakotomamonjy, Joseph Keshet, Stéphane Canu

We consider the problem of binary classification where the classifier may abstain instead of classifying each observation.

Binary Classification General Classification
