Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

tencent/hunyuan-large 4 Nov 2024

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens.

Logical Reasoning Mathematical Problem-Solving

957
5.53 stars / hour

Docling Technical Report

DS4SD/docling 19 Aug 2024

This technical report introduces Docling, an easy to use, self-contained, MIT-licensed open-source package for PDF document conversion.

7,620
4.92 stars / hour

TableGPT2: A Large Multimodal Model with Tabular Data Integration

tablegpt/tablegpt-agent 4 Nov 2024

In response, we introduce TableGPT2, a model rigorously pre-trained and fine-tuned with over 593.8K tables and 2.36M high-quality query-table-output tuples, a scale of table-related data unprecedented in prior research.

Benchmarking Data Integration

132
2.47 stars / hour

ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate

ishohei220/adopt 5 Nov 2024

Adam is one of the most popular optimization algorithms in deep learning.

Image Classification

170
1.73 stars / hour

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

plageon/HtmlRAG 5 Nov 2024

To alleviate this problem, we propose HtmlRAG, which uses HTML instead of plain text as the format of retrieved knowledge in RAG.

Hallucination RAG +1

121
1.52 stars / hour

PromptFix: You Prompt and We Fix the Photo

yeates/promptfix 27 May 2024

To address these limitations, we propose PromptFix, a comprehensive framework that enables diffusion models to follow human instructions to perform a wide variety of image-processing tasks.

Denoising Image Generation +1

348
1.13 stars / hour

In-Context LoRA for Diffusion Transformers

ali-vilab/In-Context-LoRA 31 Oct 2024

While task-specific in terms of tuning data, our framework remains task-agnostic in architecture and pipeline, offering a powerful tool for the community and providing valuable insights for further research on product-level task-agnostic generation systems.

Image Generation

366
1.34 stars / hour

OmniGen: Unified Image Generation

vectorspacelab/omnigen 17 Sep 2024

In this work, we introduce OmniGen, a new diffusion model for unified image generation.

Edge Detection Pose Estimation +2

2,130
1.31 stars / hour

PiML Toolbox for Interpretable Machine Learning Model Development and Diagnostics

selfexplainml/piml-toolbox 7 May 2023

PiML (read $\pi$-ML, /ˈpaɪ·ˈem·ˈel/) is an integrated and open-access Python toolbox for interpretable machine learning model development and model diagnostics.

Fairness Interpretable Machine Learning

1,196
1.30 stars / hour

MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views

donydchen/mvsplat360 7 Nov 2024

To evaluate MVSplat360's performance, we introduce a new benchmark using the challenging DL3DV-10K dataset, where MVSplat360 achieves superior visual quality compared to state-of-the-art methods on wide-sweeping or even 360° NVS tasks.

3D Reconstruction Denoising +2

53
1.29 stars / hour