DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

ucbepic/docetl 16 Oct 2024

We introduce (i) logical rewriting of pipelines, tailored for LLM-based tasks, (ii) an agent-guided plan evaluation mechanism that synthesizes and orchestrates task-specific validation prompts, and (iii) an optimization algorithm that efficiently finds promising plans, considering the time constraints of LLM-based plan generation and evaluation.

1,199
0.94 stars / hour

LightRAG: Simple and Fast Retrieval-Augmented Generation

hkuds/lightrag 8 Oct 2024

Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user needs.

Information Retrieval, RAG, +1 more

6,780
0.92 stars / hour

PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting

cvlab-kaist/PF3plat 29 Oct 2024

We then introduce lightweight, learnable modules to refine depth and pose estimates from the coarse alignments, improving the quality of 3D reconstruction and novel view synthesis.

3D Reconstruction, Monocular Depth Estimation, +1 more

54
0.92 stars / hour

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

SWivid/F5-TTS 9 Oct 2024

This sampling strategy for flow step can be easily applied to existing flow matching based models without retraining.

Denoising, Text to Speech

6,261
0.89 stars / hour

Blendify -- Python rendering framework for Blender

ptrvilya/blendify 23 Oct 2024

With the rapid growth of the volume of research fields like computer vision and computer graphics, researchers require effective and user-friendly rendering tools to visualize results.

10-shot image generation

739
0.85 stars / hour

Grounding Image Matching in 3D with MASt3R

naver/mast3r 14 Jun 2024

Image Matching is a core component of all best-performing algorithms and pipelines in 3D vision.

3D Reconstruction

1,262
0.84 stars / hour

ServerlessLLM: Low-Latency Serverless Inference for Large Language Models

serverlessllm/serverlessllm 25 Jan 2024

This paper presents ServerlessLLM, a distributed system designed to support low-latency serverless inference for Large Language Models (LLMs).

Scheduling

317
0.80 stars / hour

OGBench: Benchmarking Offline Goal-Conditioned RL

seohongpark/ogbench 26 Oct 2024

Despite the importance of this setting, we lack a standard benchmark that can systematically evaluate the capabilities of offline GCRL algorithms.

Benchmarking, reinforcement-learning, +2 more

65
0.80 stars / hour

Allegro: Open the Black Box of Commercial-Level Video Generation Model

rhymes-ai/allegro 20 Oct 2024

Significant advancements have been made in the field of video generation, with the open-source community contributing a wealth of research papers and tools for training high-quality models.

Video Generation

499
0.78 stars / hour

Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation

fudan-generative-vision/hallo2 10 Oct 2024

To the best of our knowledge, Hallo2, proposed in this paper, is the first method to achieve 4K resolution and generate hour-long, audio-driven portrait image animations enhanced with textual prompts.

4k, Image Animation, +2 more

3,276
0.76 stars / hour