Search Results for author: David Kanter

Found 12 papers, 8 papers with code

Speech Wikimedia: A 77 Language Multilingual Speech Dataset

1 code implementation • 30 Aug 2023 • Rafael Mosquera Gómez, Julián Eusse, Juan Ciro, Daniel Galvez, Ryan Hileman, Kurt Bollacker, David Kanter

The Speech Wikimedia Dataset is a publicly available compilation of audio with transcriptions extracted from Wikimedia Commons.

Machine Translation speech-recognition +2

LSH methods for data deduplication in a Wikipedia artificial dataset

no code implementations • 10 Dec 2021 • Juan Ciro, Daniel Galvez, Tim Schlippe, David Kanter

This paper illustrates locality-sensitive hashing (LSH) models for the identification and removal of nearly redundant data in a text dataset.

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

no code implementations • 17 Nov 2021 • Daniel Galvez, Greg Diamos, Juan Ciro, Juan Felipe Cerón, Keith Achorn, Anjali Gopi, David Kanter, Maximilian Lam, Mark Mazumder, Vijay Janapa Reddi

The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset).

speech-recognition Speech Recognition

Data Engineering for Everyone

no code implementations • 23 Feb 2021 • Vijay Janapa Reddi, Greg Diamos, Pete Warden, Peter Mattson, David Kanter

This article shows that open-source data sets are the rocket fuel for research and innovation at even some of the largest AI organizations.

BIG-bench Machine Learning

Benchmarking TinyML Systems: Challenges and Direction

2 code implementations • 10 Mar 2020 • Colby R. Banbury, Vijay Janapa Reddi, Max Lam, William Fu, Amin Fazel, Jeremy Holleman, Xinyuan Huang, Robert Hurtado, David Kanter, Anton Lokhmotov, David Patterson, Danilo Pau, Jae-sun Seo, Jeff Sieracki, Urmish Thakker, Marian Verhelst, Poonam Yadav

In this position paper, we present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads.

