Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

tencent/hunyuan-large 4 Nov 2024

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture-of-experts (MoE) model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens.

Logical Reasoning, Mathematical Problem-Solving

873
13.23 stars / hour

Docling Technical Report

DS4SD/docling 19 Aug 2024

This technical report introduces Docling, an easy-to-use, self-contained, MIT-licensed open-source package for PDF document conversion.

5,728
6.28 stars / hour

Training-free Regional Prompting for Diffusion Transformers

antonioo-c/regional-prompting-flux 4 Nov 2024

Diffusion models have demonstrated excellent capabilities in text-to-image generation.

Text-to-Image Generation

234
5.24 stars / hour

In-Context LoRA for Diffusion Transformers

ali-vilab/In-Context-LoRA 31 Oct 2024

While task-specific in terms of tuning data, our framework remains task-agnostic in architecture and pipeline, offering a powerful tool for the community and providing valuable insights for further research on product-level task-agnostic generation systems.

Image Generation

335
2.26 stars / hour

GameGen-X: Interactive Open-world Game Video Generation

gamegen-x/gamegen-x 1 Nov 2024

To realize this vision, we first collected and built an Open-World Video Game Dataset from scratch.

Text-to-Video Generation, Video Generation

109
1.93 stars / hour

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

trotsky1997/mathblackbox 3 Oct 2024

This paper presents an advanced mathematical problem-solving framework, LLaMA-Berry, for enhancing the mathematical reasoning ability of Large Language Models (LLMs).

Efficient Exploration, Mathematical Problem-Solving +1

894
1.58 stars / hour

OmniGen: Unified Image Generation

vectorspacelab/omnigen 17 Sep 2024

In this work, we introduce OmniGen, a new diffusion model for unified image generation.

Edge Detection, Pose Estimation +2

1,973
1.54 stars / hour

PromptFix: You Prompt and We Fix the Photo

yeates/promptfix 27 May 2024

To address these limitations, we propose PromptFix, a comprehensive framework that enables diffusion models to follow human instructions to perform a wide variety of image-processing tasks.

Denoising, Image Generation +1

311
1.46 stars / hour

DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

shallowdream204/dreamclear 24 Oct 2024

Our second contribution, DreamClear, is a DiT-based image restoration model.

Image Restoration

690
1.48 stars / hour

No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

cvg/NoPoSplat 31 Oct 2024

We utilize the reconstructed 3D Gaussians for novel view synthesis and pose estimation tasks and propose a two-stage coarse-to-fine pipeline for accurate pose estimation.

3D Reconstruction, Generalizable Novel View Synthesis +2

404
1.21 stars / hour