FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

ybybzhang/framepainter 14 Jan 2025

We highlight the effectiveness and efficiency of FramePainter across various editing signals: it dominantly outperforms previous state-of-the-art methods with far less training data, achieving highly seamless and coherent editing of images, e.g., automatically adjusting the reflection of the cup.

Image to Video Generation

227
3.67 stars / hour

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

sonyresearch/micro_diffusion 22 Jul 2024

As scaling laws in generative AI push performance, they simultaneously concentrate the development of these models among actors with large computational resources.

1,002
3.49 stars / hour

Tensor Product Attention Is All You Need

tensorgi/t6 11 Jan 2025

Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference.

Language Modeling

169
2.06 stars / hour

MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation

hkuds/minirag 12 Jan 2025

The growing demand for efficient and lightweight Retrieval-Augmented Generation (RAG) systems has highlighted significant challenges when deploying Small Language Models (SLMs) in existing RAG frameworks.

RAG Retrieval

250
1.76 stars / hour

UnCommon Objects in 3D

facebookresearch/uco3d 13 Jan 2025

We introduce Uncommon Objects in 3D (uCO3D), a new object-centric dataset for 3D deep learning and 3D generative AI.

Object

187
1.73 stars / hour

The GAN is dead; long live the GAN! A Modern GAN Baseline

brownvc/r3gan 9 Jan 2025

There is a widespread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks.

Image Generation

543
1.72 stars / hour

SVFR: A Unified Framework for Generalized Video Face Restoration

wangzhiyaoo/svfr 2 Jan 2025

In this paper, we propose a novel approach for the Generalized Video Face Restoration (GVFR) task, which integrates video BFR, inpainting, and colorization tasks that we empirically show to benefit each other.

Colorization Representation Learning

541
1.62 stars / hour

Agentless: Demystifying LLM-based Software Engineering Agents

OpenAutoCoder/Agentless 1 Jul 2024

However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the following question: Do we really have to employ complex autonomous software agents?

Program Repair

1,295
1.52 stars / hour

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

mbzuai-oryx/llamav-o1 10 Jan 2025

The benchmark presents a diverse set of challenges with eight different categories ranging from complex visual perception to scientific reasoning with over 4k reasoning steps in total, enabling robust evaluation of LLMs' abilities to perform accurate and interpretable visual reasoning across multiple steps.

4k Visual Reasoning

163
1.47 stars / hour

KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation

openspg/kag 10 Sep 2024

The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications.

Knowledge Graphs Question Answering +2

4,330
1.23 stars / hour