Trending Research

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

hvision-nku/storydiffusion • • 2 May 2024

This module converts the generated sequence of images into videos with smooth transitions and consistent subjects that are significantly more stable than the modules based on latent spaces only, especially in the context of long video generation.

motion prediction Story Generation +1

3,656

9.45 stars / hour

Paper
Code

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

deepseek-ai/deepseek-v2 • • 7 May 2024

MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modelling Reinforcement Learning (RL)

1,273

5.96 stars / hour

Paper
Code

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

ibm-granite/granite-code-models • 7 May 2024

Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously.

Code Generation Decoder

466

4.20 stars / hour

Paper
Code

KAN: Kolmogorov-Arnold Networks

kindxiaoming/pykan • • 30 Apr 2024

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).

9,554

3.22 stars / hour

Paper
Code

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

mit-han-lab/qserve • • 7 May 2024

The key insight driving QServe is that the efficiency of LLM serving on GPUs is critically influenced by operations on low-throughput CUDA cores.

Language Modelling Large Language Model +1

122

2.23 stars / hour

Paper
Code

Improving Diffusion Models for Virtual Try-on

yisol/IDM-VTON • • 8 Mar 2024

Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.

Ranked #1 on Virtual Try-on on VITON-HD

Virtual Try-on

2,064

1.38 stars / hour

Paper
Code

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

prometheus-eval/prometheus-eval • 2 May 2024

Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs.

Language Modelling

409

1.35 stars / hour

Paper
Code

ImageInWords: Unlocking Hyper-Detailed Image Descriptions

google/imageinwords • 5 May 2024

To address these issues, we introduce ImageInWords (IIW), a carefully designed human-in-the-loop annotation framework for curating hyper-detailed image descriptions and a new dataset resulting from this process.

Specificity Text-to-Image Generation

1.25 stars / hour

Paper
Code

Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models

assafelovic/gpt-researcher • 6 May 2023

To address the calculation errors and improve the quality of generated reasoning steps, we extend PS prompting with more detailed instructions and derive PS+ prompting.

Math

9,767

1.34 stars / hour

Paper
Code

Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

thudm/inf-dit • 7 May 2024

However, due to a quadratic increase in memory during generating ultra-high-resolution images (e. g. 4096*4096), the resolution of generated images is often limited to 1024*1024.

Image Generation Super-Resolution

1.23 stars / hour

Paper
Code