Recent progress in Large Language Models (LLMs) has significantly impacted the software development process, enabling developers to use LLM-based programming assistants for automated coding.
We study how to apply large language models to write grounded and organized long-form articles from scratch, with breadth and depth comparable to Wikipedia pages.
Recent advances in Text-to-Video (T2V) generation have achieved remarkable success in synthesizing high-quality general videos from textual descriptions.
We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.
The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged.
To address these challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images of any aspect ratio and at high resolution.
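The snippet does not spell out how LLaVA-UHD handles arbitrary resolutions, but a common ingredient in models of this kind is slicing a large image into fixed-size tiles that a standard vision encoder can process individually. The sketch below is a minimal, generic illustration of that tiling idea, not the paper's actual pipeline; `tile_size` and `slice_into_tiles` are illustrative names.

```python
from PIL import Image

def slice_into_tiles(image: Image.Image, tile_size: int = 336):
    """Split an image of any aspect ratio into fixed-size tiles.

    Each tile can then be encoded independently by a standard vision
    encoder, avoiding a single distorting resize to one fixed shape.
    """
    w, h = image.size
    cols = max(1, -(-w // tile_size))   # ceil division
    rows = max(1, -(-h // tile_size))
    # Resize so the image fits the tile grid exactly.
    image = image.resize((cols * tile_size, rows * tile_size))
    tiles = []
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_size, r * tile_size,
                   (c + 1) * tile_size, (r + 1) * tile_size)
            tiles.append(image.crop(box))
    return tiles, (rows, cols)

img = Image.new("RGB", (1024, 576))   # e.g. a 16:9 screenshot
tiles, grid = slice_into_tiles(img)
print(len(tiles), grid)               # 8 tiles in a 2x4 grid
```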
After fine-tuning, Rho-1-1B and 7B achieved state-of-the-art results of 40.6% and 51.8% on the MATH dataset, respectively, matching DeepSeekMath with only 3% of the pretraining tokens.
Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.
We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction".
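In next-scale prediction, each autoregressive step emits an entire token map at the next resolution, conditioned on all coarser maps already generated, rather than one token at a time in raster order. Below is a minimal sketch of that control flow; the `model(context, out_tokens=...)` signature and the dummy model are hypothetical stand-ins, not VAR's actual interface.

```python
import torch

def next_scale_generation(model, scales=(1, 2, 4, 8)):
    """Coarse-to-fine generation: each step predicts the *entire*
    token map at the next resolution, conditioned on all coarser
    maps, instead of one token at a time in raster order."""
    history = []  # token maps generated so far, coarse to fine
    for s in scales:
        # Flatten all previously generated maps into the context.
        context = (torch.cat([m.flatten() for m in history])
                   if history else torch.empty(0, dtype=torch.long))
        logits = model(context, out_tokens=s * s)  # (s*s, vocab)
        token_map = logits.argmax(-1).view(s, s)   # greedy, for simplicity
        history.append(token_map)
    # A real system would sample instead of argmax and decode the
    # finest map to pixels with a VQ-style decoder.
    return history[-1]

def dummy_model(context, out_tokens, vocab=4096):
    # Stand-in that returns random logits, just to make this runnable.
    return torch.randn(out_tokens, vocab)

final_map = next_scale_generation(dummy_model)
print(final_map.shape)  # torch.Size([8, 8])
```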
We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB).
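The snippet does not describe the recipe behind these results. As a point of reference only, one simple baseline for obtaining text embeddings from a decoder-only LLM is to mean-pool its final hidden states, as sketched below; gpt2 is used purely because it is small, and this is not necessarily the paper's method.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token                  # gpt2 has no pad token
model = AutoModel.from_pretrained("gpt2").eval()

def embed(texts):
    """Mean-pool the final hidden states over non-padding positions."""
    batch = tok(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (B, T, D)
    mask = batch["attention_mask"].unsqueeze(-1)   # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (B, D)

vecs = embed(["a photo of a cat", "feline on a sofa"])
print(vecs.shape, float(torch.cosine_similarity(vecs[0], vecs[1], dim=0)))
```

For causal models, last-token pooling is another common choice, since only the final position attends to the whole sequence.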