no code implementations • 14 Mar 2024 • Joel Shor, Carson McNeil, Yotam Intrator, Joseph R Ledsam, Hiro-o Yamano, Daisuke Tsurumaru, Hiroki Kayama, Atsushi Hamabe, Koji Ando, Mitsuhiko Ota, Haruei Ogino, Hiroshi Nakase, Kaho Kobayashi, Masaaki Miyo, Eiji Oki, Ichiro Takemasa, Ehud Rivlin, Roman Goldenberg
We test whether MSN, trained only on data from Israel, can detect unseen imaging techniques, narrow-band imaging (NBI) and chromoendoscopy (CE), on colonoscopes from Japan (354 videos, 128 hours).
no code implementations • 11 Dec 2023 • Joel Shor, Hiro-o Yamano, Daisuke Tsurumaru, Yotam Intrator, Hiroki Kayama, Joe Ledsam, Atsushi Hamabe, Koji Ando, Mitsuhiko Ota, Haruei Ogino, Hiroshi Nakase, Kaho Kobayashi, Eiji Oki, Roman Goldenberg, Ehud Rivlin, Ichiro Takemasa
$\textbf{Conclusion}$: Differences that prevent CADe detectors from performing well in non-medical settings do not degrade the performance of our AI CADe polyp detector when applied to data from a new country.
no code implementations • 10 Mar 2023 • Joel Shor, Ruyue Agnes Bi, Subhashini Venugopalan, Steven Ibara, Roman Goldenberg, Ehud Rivlin
We demonstrate that this metric more closely aligns with clinician preferences on medical sentences than other metrics (WER, BLEU, METEOR, etc.), sometimes by wide margins.
Automatic Speech Recognition (ASR) +1
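The baseline metrics named above are generic text-overlap scores. A minimal word error rate (WER) sketch (the clinical sentences below are illustrative, not from the paper) shows why such metrics can miss clinical severity: both hypotheses get the same WER, yet only the second changes the medical meaning.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming (Levenshtein) edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# Same WER either way, but only the second error flips the clinical meaning.
print(wer("patient denies chest pain", "patient denies chest pains"))   # 0.25
print(wer("patient denies chest pain", "patient reports chest pain"))   # 0.25
```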
no code implementations • 17 Nov 2022 • Anastasiya Belyaeva, Joel Shor, Daniel E. Cook, Kishwar Shafin, Daniel Liu, Armin Töpfer, Aaron M. Wenger, William J. Rowell, Howard Yang, Alexey Kolesnikov, Cory Y. McLean, Maria Nattestad, Andrew Carroll, Pi-Chuan Chang
Accurate genome sequencing can improve our understanding of biology and the genetic basis of disease.
no code implementations • 2 Nov 2022 • Joel Shor, Nick Johnston
Compression is essential to storing and transmitting medical videos, but the effect of compression on downstream medical tasks is often ignored.
no code implementations • 1 Mar 2022 • Joel Shor, Subhashini Venugopalan
Our largest distilled model is less than 15% the size of the original model (314MB vs 2.2GB), achieves over 96% of the accuracy on 6 of 7 tasks, and is trained on 6.5% of the data.
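The entry reports model sizes rather than the training objective; as a hedged illustration of how such a student is typically distilled from a teacher, here is a common knowledge-distillation loss (temperature-softened KL divergence between teacher and student outputs, with stand-in logits, not the paper's setup):

```python
import numpy as np

def softmax(x: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = np.exp((x - x.max(axis=-1, keepdims=True)) / temperature)
    return z / z.sum(axis=-1, keepdims=True)

# Stand-in teacher/student logits for a batch of two examples, three classes.
teacher_logits = np.array([[4.0, 1.0, 0.5], [0.2, 3.5, 1.0]])
student_logits = np.array([[2.0, 1.5, 0.5], [0.5, 2.0, 1.5]])

T = 2.0  # temperature softens the teacher distribution
p = softmax(teacher_logits, T)
q = softmax(student_logits, T)

# KL(p || q) averaged over the batch: the usual distillation objective,
# minimized w.r.t. the student's parameters during training.
distill_loss = np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))
print(float(distill_loss))
```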
no code implementations • 9 Oct 2021 • Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang
Many speech applications require understanding aspects beyond the words being spoken, such as recognizing emotion, detecting whether the speaker is wearing a mask, or distinguishing real from synthetic speech.
no code implementations • 27 Sep 2021 • Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu
We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio.
Ranked #1 on Speech Recognition on Common Voice
Automatic Speech Recognition (ASR) +3
no code implementations • 8 Jul 2021 • Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner
Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of speech impairment.
1 code implementation • 25 Feb 2020 • Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Omry Tuval, Felix de Chaumont Quitry, Marco Tagliasacchi, Ira Shavitt, Dotan Emanuel, Yinnon Haviv
The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for different datasets or tasks.
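That transfer-learning recipe, a frozen pre-trained embedding with only a small head trained on the downstream labels, can be sketched as follows. The embedding here is a stand-in fixed random projection so the example runs end to end; it is not the paper's model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def pretrained_embed(batch: np.ndarray) -> np.ndarray:
    # Stand-in for a frozen, pre-trained embedding model (e.g. audio -> vector):
    # a fixed 128 -> 32 random projection whose weights are never updated.
    projection = np.random.default_rng(42).normal(size=(batch.shape[1], 32))
    return batch @ projection

# Small labeled downstream dataset; only the classifier head is trained.
X = rng.normal(size=(200, 128))
y = (X[:, 0] > 0).astype(int)

clf = LogisticRegression(max_iter=1000).fit(pretrained_embed(X), y)
print(f"downstream accuracy: {clf.score(pretrained_embed(X), y):.2f}")
```

The key point is that `pretrained_embed` is treated as a fixed feature extractor: labeled-data requirements shrink because only the small head's parameters are fit.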
1 code implementation • 31 Jul 2019 • Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias
In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech.
Automatic Speech Recognition (ASR) +1
2 code implementations • ICML 2018 • RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous
We present an extension to the Tacotron speech synthesis architecture that learns a latent embedding space of prosody, derived from a reference acoustic representation containing the desired prosody.
11 code implementations • ICML 2018 • Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous
In this work, we propose "global style tokens" (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system.
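The GST mechanism attends over a small bank of token embeddings to produce a single style embedding. A minimal numpy sketch, with randomly initialized (untrained) tokens and a stand-in reference-encoder query rather than the jointly trained Tacotron components:

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, token_dim, ref_dim = 10, 64, 128

# Bank of "global style tokens"; in the paper these are trained jointly
# with Tacotron, here they are random for illustration only.
style_tokens = rng.normal(size=(num_tokens, token_dim))

# Query summarizing the reference audio (stand-in for the reference encoder).
query = rng.normal(size=(ref_dim,))
W_q = rng.normal(size=(ref_dim, token_dim)) / np.sqrt(ref_dim)

# Scaled dot-product attention over the token bank: the style embedding is
# a convex combination of tokens, so styles interpolate between learned ones.
scores = (query @ W_q) @ style_tokens.T / np.sqrt(token_dim)
weights = np.exp(scores - scores.max())
weights /= weights.sum()
style_embedding = weights @ style_tokens  # shape: (token_dim,)
print(style_embedding.shape)
```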
no code implementations • 7 Feb 2018 • David Minnen, George Toderici, Michele Covell, Troy Chinen, Nick Johnston, Joel Shor, Sung Jin Hwang, Damien Vincent, Saurabh Singh
Deep neural networks represent a powerful class of function approximators that can learn to compress and reconstruct images.
no code implementations • 1 Nov 2017 • Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous
Prosodic modeling is a core problem in speech synthesis.
no code implementations • 18 May 2017 • Michele Covell, Nick Johnston, David Minnen, Sung Jin Hwang, Joel Shor, Saurabh Singh, Damien Vincent, George Toderici
We introduce a multi-pass training method that combines the goals of high-quality reconstruction both in areas around stop-code masking and in highly detailed areas.
no code implementations • CVPR 2018 • Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Troy Chinen, Sung Jin Hwang, Joel Shor, George Toderici
We propose a method for lossy image compression based on recurrent, convolutional neural networks that outperforms BPG (4:2:0), WebP, JPEG2000, and JPEG as measured by MS-SSIM.
7 code implementations • CVPR 2017 • George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, Michele Covell
As far as we know, this is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.