LAVIS: A Library for Language-Vision Intelligence

salesforce/lavis 15 Sep 2022

We introduce LAVIS, an open-source deep learning library for LAnguage-VISion research and applications.

Image Captioning Image Retrieval +6

0.67 stars / hour

Protein structure generation via folding diffusion

microsoft/foldingdiff 30 Sep 2022

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases.

Denoising Protein Structure Prediction

0.67 stars / hour

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Robust Speech Recognition

0.55 stars / hour


facebookresearch/rl ICML 2018

A modular, primitive-first, Python-first PyTorch library for Reinforcement Learning.

Continuous Control Decision Making +2

0.46 stars / hour

High-Resolution Image Synthesis with Latent Diffusion Models

compvis/stable-diffusion CVPR 2022

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.

Denoising Image Inpainting +3

0.45 stars / hour


towhee-io/towhee 22 Oct 2020

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Audio Fingerprint Contrastive Learning +1

0.41 stars / hour

Mega: Moving Average Equipped Gated Attention

facebookresearch/mega 21 Sep 2022

The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.

Image Classification +3

0.38 stars / hour

VToonify: Controllable High-Resolution Portrait Video Style Transfer

williamyang1991/vtoonify 22 Sep 2022

Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency.

Face Alignment Style Transfer +1

0.34 stars / hour

EditEval: An Instruction-Based Benchmark for Text Improvements

facebookresearch/editeval 27 Sep 2022

Evaluation of text generation to date has primarily focused on content created sequentially, rather than improvements on a piece of text.

Text Generation

0.34 stars / hour