Trending Research

Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

stanford-oval/storm • 22 Feb 2024

We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages.

Retrieval

3,745

6.36 stars / hour

Paper
Code

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

dvlab-research/minigemini • 27 Mar 2024

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Ranked #8 on Visual Question Answering on MM-Vet

Image Comprehension, Visual Dialog, +1

2,792

5.79 stars / hour

Paper
Code

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

tencentarc/instantmesh • 10 Apr 2024

We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.

Image to 3D

1,226

3.23 stars / hour

Paper
Code

LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models

facebookresearch/llm-transparency-tool • 10 Apr 2024

We present the LM Transparency Tool (LM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models.

Decision Making

567

3.10 stars / hour

Paper
Code

Magic Clothing: Controllable Garment-Driven Image Synthesis

shinechen1024/magicclothing • 15 Apr 2024

We propose Magic Clothing, a latent diffusion model (LDM)-based network architecture for an unexplored garment-driven image synthesis task.

Image Generation

849

2.10 stars / hour

Paper
Code

Solving Data Quality Problems with Desbordante: a Demo

mstrutov/desbordante • 27 Jul 2023

Most existing data profiling systems that focus on complex statistics do not provide proper integration with the tools used by contemporary data scientists.

Anomaly Detection, Descriptive

315

1.94 stars / hour

Paper
Code

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

xuezhemax/megalodon • 12 Apr 2024

The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.

290

1.83 stars / hour

Paper
Code

Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations

facebookresearch/generative-recommenders • 27 Feb 2024

Large-scale recommendation systems are characterized by their reliance on high-cardinality, heterogeneous features and the need to handle tens of billions of user actions on a daily basis.

Ranked #1 on Recommendation Systems on MovieLens 20M (HR@10 (full corpus) metric)

Recommendation Systems

146

1.79 stars / hour

Paper
Code

MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

zjukg/mygo • 15 Apr 2024

To overcome the inherent incompleteness of multi-modal knowledge graphs (MMKGs), multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given MMKGs, leveraging both structural information from the triples and multi-modal information of the entities.

Contrastive Learning, Descriptive, +3

163

1.32 stars / hour

Paper
Code

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

FoundationVision/VAR • 3 Apr 2024

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction".

Ranked #7 on Image Generation on ImageNet 256x256

Image Generation, Language Modelling, +2

2,761

1.32 stars / hour

Paper
Code