Robust Speech Recognition via Large-Scale Weak Supervision

ggerganov/whisper.cpp Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Ranked #1 on Speech Recognition on Common Voice Italian (using extra training data)

Robust Speech Recognition speech-recognition

30,800
0.69 stars / hour

Pre-training Small Base LMs with Fewer Tokens

Lightning-AI/lit-gpt 12 Apr 2024

Here we show that smaller LMs trained utilizing some of the layers of GPT-2-medium (355M) and GPT-2-large (770M) can effectively match the validation loss of their bigger counterparts when trained from scratch for the same number of training steps on the OpenWebText dataset with 9B tokens.

Language Modelling

6,498
0.75 stars / hour

Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

FaceOnLive/Face-Liveness-Detection-SDK-Linux 12 Apr 2024

SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improves the model's ability to detect "unseen" attack types.

Data Augmentation Face Anti-Spoofing +1

180
0.73 stars / hour

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

instantstyle/instantstyle 3 Apr 2024

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.

Text-to-Image Generation

1,032
0.67 stars / hour

AutoCodeRover: Autonomous Program Improvement

nus-apr/auto-code-rover 8 Apr 2024

Recent progress in Large Language Models (LLMs) has significantly impacted the development process, where developers can use LLM-based programming assistants to achieve automated coding.

Bug fixing Code Search +1

1,893
0.65 stars / hour

From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

robertvacareanu/llm4regression 11 Apr 2024

We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc.) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates.

Language Modelling Large Language Model +1

86
0.60 stars / hour

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

bytedance/schurvins 4 Dec 2023

We propose a novel filter-based VINS framework named SchurVINS, which can guarantee both high accuracy, by building a complete residual model, and low computational complexity, by using the Schur complement.

Computational Efficiency

233
0.56 stars / hour

TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting

ibm/tsfm 14 Jun 2023

TSMixer outperforms state-of-the-art MLP and Transformer models in forecasting by a considerable margin of 8-60%.

Multivariate Time Series Forecasting Representation Learning +2

144
0.56 stars / hour

SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models

letterligo/text-agnostic-governance 10 Apr 2024

The key idea is to eliminate unsafe visual representations from the model regardless of the text input.

85
0.54 stars / hour

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

mcgill-nlp/llm2vec 9 Apr 2024

We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB).

Contrastive Learning

230
0.53 stars / hour