5 code implementations • Preprint 2022 • Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever
We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.
Ranked #1 on Speech Recognition on Common Voice English (using extra training data)
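The released Whisper models are available through the open-source `openai-whisper` package. A minimal transcription sketch, assuming that package is installed and that `audio.mp3` is a placeholder for a local file:

```python
# Minimal sketch: zero-shot transcription with the released openai-whisper package.
# "audio.mp3" is a placeholder path; "base" is one of the smaller checkpoints.
import whisper

model = whisper.load_model("base")      # downloads the checkpoint on first use
result = model.transcribe("audio.mp3")  # language detection + decoding in one call
print(result["text"])                   # the recognized transcript
```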
1 code implementation • 24 Jan 2022 • Arvind Neelakantan, Tao Xu, Raul Puri, Alec Radford, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas Tezak, Jong Wook Kim, Chris Hallacy, Johannes Heidecke, Pranav Shyam, Boris Power, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger, David Schnurr, Felipe Petroski Such, Kenny Hsu, Madeleine Thompson, Tabarak Khan, Toki Sherbakov, Joanne Jang, Peter Welinder, Lilian Weng
Similarly to text embeddings, we train code embedding models on (text, code) pairs, obtaining a 20.8% relative improvement over prior best work on code search.
Ranked #1 on Code Search on CodeSearchNet
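At a high level, such embedding models are trained contrastively on paired inputs with in-batch negatives. Below is a minimal sketch of a symmetric, InfoNCE-style objective for (text, code) embedding pairs; the batch construction and temperature value are assumptions for illustration, not the paper's released code:

```python
# Sketch of a symmetric contrastive objective over a batch of paired embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(text_emb, code_emb, temperature=0.05):
    # L2-normalize so dot products are cosine similarities.
    text_emb = F.normalize(text_emb, dim=-1)
    code_emb = F.normalize(code_emb, dim=-1)
    logits = text_emb @ code_emb.T / temperature   # (B, B) similarity matrix
    targets = torch.arange(len(text_emb))          # matching pairs lie on the diagonal
    # Symmetric cross-entropy: text-to-code and code-to-text retrieval.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
```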
3 code implementations • CVPR 2022 • Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt
Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution.
Ranked #10 on Image Classification on ObjectNet (using extra training data)
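WiSE-FT ensembles in weight space by linearly interpolating the zero-shot and fine-tuned checkpoints. A minimal sketch, assuming both checkpoints share the same architecture and floating-point parameters, with the mixing coefficient `alpha` chosen on held-out data:

```python
# Sketch of WiSE-FT weight-space ensembling between two state dicts.
import torch

def wise_ft(zero_shot_state, fine_tuned_state, alpha=0.5):
    """Interpolate two state dicts with identical keys and shapes."""
    return {
        k: (1 - alpha) * zero_shot_state[k] + alpha * fine_tuned_state[k]
        for k in zero_shot_state
    }

# model.load_state_dict(wise_ft(zs_sd, ft_sd, alpha=0.5))  # then evaluate as usual
```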
no code implementations • 5 Aug 2021 • Sandhini Agarwal, Gretchen Krueger, Jack Clark, Alec Radford, Jong Wook Kim, Miles Brundage
Recently, with the advent of models such as CLIP and ALIGN, there have been breakthroughs in computer vision ("CV") models that generalize more broadly.
41 code implementations • 26 Feb 2021 • Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories.
Ranked #1 on Zero-Shot Learning on COCO-MLT (using extra training data)
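Because CLIP learns from natural-language supervision, any set of class descriptions can be scored at inference time. A minimal zero-shot classification sketch using the released `clip` package; the image path and candidate captions below are placeholders:

```python
# Sketch: zero-shot image classification with the released CLIP package.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("cat.jpg")).unsqueeze(0).to(device)  # placeholder image
text = clip.tokenize(["a photo of a cat", "a photo of a dog"]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)
print(probs)  # probability assigned to each natural-language class description
```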
no code implementations • 14 Jan 2021 • Kennedy Edemacu, Beakcheol Jang, Jong Wook Kim
Multi-party machine learning is a paradigm in which multiple participants collaboratively train a machine learning model to achieve a common learning objective without sharing their privately owned data.
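As a generic illustration of the paradigm (not this paper's specific method), participants can train locally on their private data and share only model updates, which a coordinator averages FedAvg-style; all names below are illustrative:

```python
# Sketch of the multi-party setting: raw data never leaves each participant,
# only locally computed weight vectors are aggregated.
import numpy as np

def average_updates(local_weights):
    """Combine per-participant weight vectors without exchanging raw data."""
    return np.mean(np.stack(local_weights), axis=0)

# Each party i computes w_i from its own private dataset, then:
# global_w = average_updates([w_1, w_2, w_3])
```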
12 code implementations • Preprint 2020 • Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever
We introduce Jukebox, a model that generates music with singing in the raw audio domain.
no code implementations • 26 Aug 2019 • Jong Seon Kim, Jong Wook Kim, Yon Dohn Chung
A point-of-interest (POI) recommendation system plays an important role in location-based services because it can help people explore new locations and help advertisers place advertisements at appropriate locations.
no code implementations • 24 Aug 2019 • Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, Miles McCain, Alex Newhouse, Jason Blazakis, Kris McGuffie, Jasmine Wang
Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more.
no code implementations • 20 Jun 2019 • Jong Wook Kim, Juan Pablo Bello
Automatic music transcription is considered to be one of the hardest problems in music information retrieval, yet recent deep learning approaches have achieved substantial improvements in transcription performance.
no code implementations • 1 Nov 2018 • Jong Wook Kim, Rachel Bittner, Aparna Kumar, Juan Pablo Bello
The recent success of raw audio waveform synthesis models like WaveNet motivates a new approach for music synthesis, in which the entire process of creating audio samples from a score and instrument information is modeled using generative neural networks.
1 code implementation • 17 Feb 2018 • Jong Wook Kim, Justin Salamon, Peter Li, Juan Pablo Bello
To date, the best performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics.
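CREPE instead operates directly on the waveform with a convolutional network. A minimal pitch-tracking sketch using the released `crepe` package; `vocal.wav` is a placeholder file:

```python
# Sketch: monophonic pitch tracking with the released crepe package.
from scipy.io import wavfile
import crepe

sr, audio = wavfile.read("vocal.wav")  # placeholder audio file
time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)
print(frequency[:10])  # estimated f0 contour in Hz
```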