RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

caillonantoine/RAVE 9 Nov 2021

By leveraging a multi-band decomposition of the raw waveform, we show that our model is the first able to generate 48kHz audio signals, while simultaneously running 20 times faster than real-time on a standard laptop CPU.

Representation Learning

639
0.23 stars / hour

Multiview Compressive Coding for 3D Reconstruction

facebookresearch/mcc 19 Jan 2023

We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.

3D Reconstruction, Self-Supervised Learning +1

200
0.22 stars / hour

SensorX2car: Sensors-to-car calibration for autonomous driving in road scenarios

opencalib/sensorx2car 18 Jan 2023

To this end, we present SensorX2car, a calibration toolbox for the online calibration of sensor-to-car coordinate systems in road scenes.

Autonomous Driving

48
0.22 stars / hour

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Code Generation, Language Modelling +1

2,058
0.21 stars / hour

Planar Object Tracking via Weighted Optical Flow

serycjon/WOFT 24 Jan 2023

We propose WOFT -- a novel method for planar object tracking that estimates a full 8 degrees-of-freedom pose, i.e. the homography w.r.t.

Object Tracking, Optical Flow Estimation

36
0.20 stars / hour

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Robust Speech Recognition, speech-recognition

22,322
0.20 stars / hour

CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators

cleanlab/cleanlab 13 Oct 2022

For analyzing such data, we introduce CROWDLAB, a straightforward approach to utilize any trained classifier to estimate: (1) A consensus label for each example that aggregates the available annotations; (2) A confidence score for how likely each consensus label is correct; (3) A rating for each annotator quantifying the overall correctness of their labels.

4,972
0.20 stars / hour

TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Bases

microsoft/KC 24 Oct 2022

Pre-trained language models (PLMs) have shown their effectiveness in multiple scenarios.

Question Answering, Retrieval

66
0.19 stars / hour

Synthcity: facilitating innovative use cases of synthetic data in different data modalities

vanderschaarlab/synthcity 18 Jan 2023

Synthcity is an open-source software package for innovative use cases of synthetic data in ML fairness, privacy and augmentation across diverse tabular data modalities, including static data, regular and irregular time series, data with censoring, multi-source data, composite data, and more.

Fairness, Irregular Time Series +1

51
0.18 stars / hour

Text2Poster: Laying out Stylized Texts on Retrieved Images

chuhaojin/text2poster-icassp-22 6 Jan 2023

Poster generation is a significant task for a wide range of applications, which is often time-consuming and requires lots of manual editing and artistic experience.

Image Retrieval, Layout Design +1

74
0.18 stars / hour