Search Results for author: Joseph Keshet

Found 37 papers, 10 papers with code

On Randomized Classification Layers and Their Implications in Natural Language Generation

no code implementations • NAACL (maiworkshop) 2021 • Gal-Lev Shalev, Gabi Shalev, Joseph Keshet

In natural language generation tasks, a neural language model is used for generating a sequence of words forming a sentence.

Image Captioning Language Modelling +4

Combining Language Models For Specialized Domains: A Colorful Approach

no code implementations • 30 Oct 2023 • Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayach, Gidon Krendel, Aviv Shamsian, Aviv Navon, Gil Hetz, Joseph Keshet

In this work, we introduce a novel approach that integrates a domain-specific or secondary LM into a general-purpose LM.

Automatic Speech Recognition speech-recognition +1

DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation

1 code implementation • 2 Oct 2023 • Roi Benita, Michael Elad, Joseph Keshet

Diffusion models have recently been shown to be relevant for high-quality speech generation.

Denoising valid

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

no code implementations • 13 Sep 2023 • Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet

Open vocabulary keyword spotting is a crucial and challenging task in automatic speech recognition (ASR) that focuses on detecting user-defined keywords within a spoken utterance.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

no code implementations • 12 Jul 2022 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet

Image captioning research has achieved breakthroughs in recent years by developing neural models that can generate diverse and high-quality descriptions for images drawn from the same distribution as training images.

Image Captioning Out of Distribution (OOD) Detection

DDKtor: Automatic Diadochokinetic Speech Analysis

1 code implementation • 29 Jun 2022 • Yael Segal, Kasia Hitczenko, Matthew Goldrick, Adam Buchwald, Angela Roberts, Joseph Keshet

The segmentations predicted by the models are used to obtain measures of speech rate and sound duration.

THOR: Threshold-Based Ranking Loss for Ordinal Regression

no code implementations • 10 May 2022 • Tzeviya Sylvia Fuchs, Joseph Keshet

In this work, we present a regression-based ordinal regression algorithm for supervised classification of instances into ordinal categories.

regression

Self-supervised Speaker Diarization

no code implementations • 8 Apr 2022 • Yehoshua Dissen, Felix Kreuk, Joseph Keshet

Specifically, the study focuses on generating high-quality neural speaker representations without any annotated data, as well as on estimating secondary hyperparameters of the model without annotations.

speaker-diarization Speaker Diarization +1

Correcting Mispronunciations in Speech using Spectrogram Inpainting

no code implementations • 7 Apr 2022 • Talia Ben-Simon, Felix Kreuk, Faten Awwad, Jacob T. Cohen, Joseph Keshet

Grownup learners of a language tweak their speech to match the tutor reference.

DeepFry: Identifying Vocal Fry Using Deep Neural Networks

1 code implementation • 31 Mar 2022 • Bronya R. Chernyak, Talia Ben Simon, Yael Segal, Jeremy Steffman, Eleanor Chodroff, Jennifer S. Cole, Joseph Keshet

The classifier is implemented as a multi-headed fully-connected network trained to detect creaky voice, voicing, and pitch, where the last two are used to refine creak prediction.

Constant Random Perturbations Provide Adversarial Robustness with Minimal Effect on Accuracy

1 code implementation • 15 Mar 2021 • Bronya Roni Chernyak, Bhiksha Raj, Tamir Hazan, Joseph Keshet

This paper proposes an attack-independent (non-adversarial training) technique for improving adversarial robustness of neural network models, with minimal loss of standard accuracy.

Adversarial Robustness

CNN-based Spoken Term Detection and Localization without Dynamic Programming

no code implementations • 7 Mar 2021 • Tzeviya Sylvia Fuchs, Yael Segal, Joseph Keshet

In this paper, we propose a spoken term detection algorithm for simultaneous prediction and localization of in-vocabulary and out-of-vocabulary terms within an audio segment.

Word Embeddings

Redesigning the classification layer by randomizing the class representation vectors

no code implementations • 16 Nov 2020 • Gabi Shalev, Gal-Lev Shalev, Joseph Keshet

We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these vectors.

Classification General Classification +1

Fairness in the Eyes of the Data: Certifying Machine-Learning Models

no code implementations • 3 Sep 2020 • Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, Joseph Keshet

We present a framework for certifying the fairness degree of a model based on an interactive and privacy-preserving test.

BIG-bench Machine Learning Fairness +1

Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation

2 code implementations • 27 Jul 2020 • Felix Kreuk, Joseph Keshet, Yossi Adi

Results suggest that our approach surpasses the baseline models and reaches state-of-the-art performance on both data sets.

Boundary Detection Contrastive Learning +1

Phoneme Boundary Detection using Learnable Segmental Features

1 code implementation • 11 Feb 2020 • Felix Kreuk, Yaniv Sheena, Joseph Keshet, Yossi Adi

Phoneme boundary detection is an essential first step for a variety of speech processing applications such as speaker diarization, speech science, and keyword spotting.

Boundary Detection Keyword Spotting +2

Dr.VOT: Measuring Positive and Negative Voice Onset Time in the Wild

1 code implementation • 27 Oct 2019 • Yosi Shrem, Matthew Goldrick, Joseph Keshet

Voice Onset Time (VOT), a key measurement of speech for basic research and applied medical studies, is the time between the onset of a stop burst and the onset of voicing.

Multi-Task Learning Structured Prediction

SpeechYOLO: Detection and Localization of Speech Objects

no code implementations • 14 Apr 2019 • Yael Segal, Tzeviya Sylvia Fuchs, Joseph Keshet

In this paper, we propose to apply object detection methods from the vision domain to the speech recognition domain, by treating audio fragments as objects.

General Classification Keyword Spotting +5

Hide and Speak: Towards Deep Neural Networks for Speech Steganography

1 code implementation • 7 Feb 2019 • Felix Kreuk, Yossi Adi, Bhiksha Raj, Rita Singh, Joseph Keshet

Steganography is the science of hiding a secret message within an ordinary public message, which is referred to as the carrier.

Out-of-Distribution Detection using Multiple Semantic Label Representations

no code implementations • NeurIPS 2018 • Gabi Shalev, Yossi Adi, Joseph Keshet

Deep Neural Networks are powerful models that have attained remarkable results on a variety of tasks.

Out-of-Distribution Detection

Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

2 code implementations • 13 Feb 2018 • Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet

Unfortunately, once the models are sold they can be easily copied and redistributed.

General Classification

Deceiving End-to-End Deep Learning Malware Detectors using Adversarial Examples

no code implementations • 13 Feb 2018 • Felix Kreuk, Assi Barak, Shir Aviv-Reuven, Moran Baruch, Benny Pinkas, Joseph Keshet

Deep learning models have been successfully applied to malware detection.

Image Segmentation Malware Detection +4

Fooling End-to-end Speaker Verification by Adversarial Examples

no code implementations • 10 Jan 2018 • Felix Kreuk, Yossi Adi, Moustapha Cisse, Joseph Keshet

We also present two black-box attacks: one in which the adversarial examples are generated with a system trained on YOHO but used to attack a system trained on NTIMIT, and one in which the adversarial examples are generated with a system trained on the Mel-spectrum feature set but used to attack a system trained on MFCC.

Speaker Verification

Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples

no code implementations • NeurIPS 2017 • Moustapha M. Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet

Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines.

General Classification Pose Estimation +3

Houdini: Fooling Deep Structured Prediction Models

no code implementations • 17 Jul 2017 • Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet

Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines.

General Classification Pose Estimation +4

Automatic Measurement of Pre-aspiration

no code implementations • 5 Apr 2017 • Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet

Pre-aspiration is defined as the period of glottal friction occurring in sequences of vocalic/consonantal sonorants and phonetically voiceless obstruents.

Friction Structured Prediction

Learning Similarity Functions for Pronunciation Variations

no code implementations • 28 Mar 2017 • Einat Naaman, Yossi Adi, Joseph Keshet

This task generalizes problems such as lexical access (the problem of learning the mapping between words and their possible pronunciations), and defining word neighborhoods.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Domain Adaptation For Formant Estimation Using Deep Learning

no code implementations • 6 Nov 2016 • Yehoshua Dissen, Joseph Keshet, Jacob Goldberger, Cynthia Clopper

We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speech domains with very different characteristics.

Domain Adaptation

Automatic measurement of vowel duration via structured prediction

1 code implementation • 26 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Erin Gustafson, Cynthia Clopper, Matthew Goldrick

Manually annotated data were used to train a model that takes as input an arbitrary-length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants, and outputs the duration of the vowel.

Structured Prediction

Sequence Segmentation Using Joint RNN and Structured Prediction Models

no code implementations • 25 Oct 2016 • Yossi Adi, Joseph Keshet, Emily Cibelli, Matthew Goldrick

We describe and analyze a simple and effective algorithm for sequence segmentation applied to speech processing tasks.

Segmentation Structured Prediction

Context-Based Prediction of App Usage

no code implementations • 24 Dec 2015 • Joseph Keshet, Adam Kariv, Arnon Dagan, Dvir Volk, Joey Simhon

There are around a hundred installed apps on an average smartphone.

Navigate

Risk Minimization in Structured Prediction using Orbit Loss

no code implementations • 7 Dec 2015 • Danny Karmon, Joseph Keshet

Methods aimed at risk minimization, such as the structured ramp loss, the structured probit loss, and direct loss minimization, require at least two inference operations per training iteration.

Structured Prediction