World Model on Million-Length Video And Language With RingAttention

LargeWorldModel/LWM 13 Feb 2024

This work paves the way for training on massive datasets of long video and language to develop understanding of both human knowledge and the multimodal world, and broader capabilities.

Video Understanding

6,082
0.86 stars / hour

Differential Diffusion: Giving Each Pixel Its Strength

exx8/differential-diffusion 1 Jun 2023

With the rise of diffusion models, image editing via textual instructions has become ubiquitous.

Text-based Image Editing

119
0.82 stars / hour

TinyLLaVA: A Framework of Small-scale Large Multimodal Models

dlcv-buaa/tinyllavabench 22 Feb 2024

We present the TinyLLaVA framework that provides a unified perspective in designing and analyzing the small-scale Large Multimodal Models (LMMs).

Visual Question Answering

73
0.70 stars / hour

SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

willisma/sit 16 Jan 2024

We present Scalable Interpolant Transformers (SiT), a family of generative models built on the backbone of Diffusion Transformers (DiT).

Image Generation

363
0.70 stars / hour

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

nvlabs/t-stitch 21 Feb 2024

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model.

Image Generation

63
0.68 stars / hour

Data Engineering for Scaling Language Models to 128K Context

franxyao/long-context-data-engineering 15 Feb 2024

We demonstrate that continual pretraining of the full model on 1B-5B tokens of such data is an effective and affordable strategy for scaling the context length of language models to 128K.

Continual Pretraining

202
0.62 stars / hour

Vectorized and performance-portable Quicksort

google/highway 12 May 2022

Recent works showed that implementations of Quicksort using vector CPU instructions can outperform the non-vectorized algorithms in widespread use.

3,378
0.61 stars / hour

AlphaFold Meets Flow Matching for Generating Protein Ensembles

bjing2016/alphaflow 7 Feb 2024

When trained and evaluated on the PDB, our method provides a superior combination of precision and diversity compared to AlphaFold with MSA subsampling.

139
0.59 stars / hour

Large Language Models for Data Annotation: A Survey

zhen-tan-dmml/llm4annotation 21 Feb 2024

Furthermore, the paper includes an in-depth taxonomy of methodologies employing LLMs for data annotation, a comprehensive review of learning strategies for models incorporating LLM-generated annotations, and a detailed discussion on primary challenges and limitations associated with using LLMs for data annotation.

91
0.56 stars / hour

Towards Building Multilingual Language Model for Medicine

magic-ai4med/mmedlm 21 Feb 2024

In this paper, we aim to develop an open-source, multilingual language model for medicine that benefits a wider, linguistically diverse audience from different regions.

Language Modelling, Question Answering

70
0.56 stars / hour