This technical report introduces Docling, an easy-to-use, self-contained, MIT-licensed open-source package for PDF document conversion.
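For context, a minimal usage sketch based on Docling's documented Python API; the source URL is a placeholder, and the project README should be consulted for the current interface:

```python
from docling.document_converter import DocumentConverter

source = "https://arxiv.org/pdf/2408.09869"  # placeholder: a local path or URL to a PDF
converter = DocumentConverter()
result = converter.convert(source)

# Export the parsed document to Markdown.
print(result.document.export_to_markdown())
```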
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activation parameters, capable of handling up to 256K tokens.
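To illustrate why a mixture-of-experts model's activated parameter count (52B) is far smaller than its total (389B), here is a toy top-k routing sketch; the dimensions, expert count, and gating scheme are illustrative and are not Hunyuan-Large's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 8, 16, 2                       # toy sizes, chosen for illustration
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))

def moe_forward(x):
    """Top-k routing: only k of n_experts run per token, so the 'activated'
    parameter count is roughly k/n_experts of the total."""
    logits = gate_w @ x
    topk = np.argsort(logits)[-k:]               # indices of the k highest-scoring experts
    w = np.exp(logits[topk] - logits[topk].max())
    w /= w.sum()                                 # softmax over the selected experts only
    return sum(wi * (experts[i] @ x) for wi, i in zip(w, topk))

y = moe_forward(rng.standard_normal(d))
```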
Adam is one of the most popular optimization algorithms in deep learning.
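For reference, Adam's update combines a bias-corrected first-moment (momentum) estimate with a bias-corrected second-moment (RMS) scale; a minimal NumPy sketch, with the learning rate chosen for the toy demo below:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (t is 1-indexed for the bias corrections)."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2        # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1**t)                   # bias-corrected moments
    v_hat = v / (1 - beta2**t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Demo: minimize f(x) = x^2, whose gradient is 2x.
theta, m, v = np.array([5.0]), 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # approaches 0
```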
In response, we introduce TableGPT2, a model rigorously pre-trained and fine-tuned with over 593.8K tables and 2.36M high-quality query-table-output tuples, a scale of table-related data unprecedented in prior research.
In this work, we introduce OmniGen, a new diffusion model for unified image generation.
To alleviate this problem, we propose HtmlRAG, which uses HTML instead of plain text as the format of retrieved knowledge in RAG.
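As a rough illustration of the stated idea, the hypothetical helper below contrasts keeping retrieved HTML (tags, tables, and headings intact) with flattening it to plain text; it is not the paper's implementation:

```python
import re

def build_context(retrieved_html_blocks, keep_html=True):
    """Assemble retrieved knowledge for the generator prompt.

    keep_html=True preserves tags (headings, tables, lists), the format
    HtmlRAG argues for; keep_html=False flattens to plain text.
    Hypothetical helper for illustration only.
    """
    if keep_html:
        blocks = retrieved_html_blocks
    else:
        blocks = [re.sub(r"<[^>]+>", " ", b) for b in retrieved_html_blocks]
    return "\n\n".join(blocks)

docs = ["<h2>Results</h2><table><tr><td>acc</td><td>0.91</td></tr></table>"]
print(build_context(docs))                    # document structure preserved
print(build_context(docs, keep_html=False))   # structure lost after tag stripping
```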
To evaluate MVSplat360's performance, we introduce a new benchmark using the challenging DL3DV-10K dataset, where MVSplat360 achieves superior visual quality compared to state-of-the-art methods on wide-sweeping or even 360° NVS tasks.
The widespread adoption of Transformer architectures across data modalities has opened new avenues for applications in molecular modeling.
It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to full-matrix AdaGrad for each parameter of the neural network.
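A minimal NumPy sketch of this kind of Kronecker-factored preconditioning, in the style of Shampoo: for a 2-D parameter block, left and right gradient statistics are accumulated, and their inverse fourth roots precondition the gradient. The step size and epsilon here are illustrative, not tuned values:

```python
import numpy as np

def inv_fourth_root(M, eps=1e-6):
    """Matrix inverse fourth root M^{-1/4} via eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag((vals + eps) ** -0.25) @ vecs.T

def shampoo_step(W, G, L, R, lr=0.1):
    """One Shampoo-style step for a 2-D parameter block.

    L and R accumulate row/column gradient statistics; the preconditioned
    update L^{-1/4} G R^{-1/4} is the Kronecker-factored approximation
    to full-matrix AdaGrad's preconditioner for this block.
    """
    L += G @ G.T            # left (row) statistics
    R += G.T @ G            # right (column) statistics
    return W - lr * inv_fourth_root(L) @ G @ inv_fourth_root(R), L, R
```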
Specifically, WebRL incorporates 1) a self-evolving curriculum that generates new tasks from unsuccessful attempts, 2) a robust outcome-supervised reward model (ORM), and 3) adaptive reinforcement learning strategies to ensure consistent improvements.
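A hypothetical outline of how these three components might wire together; all class and function names below are illustrative stubs, not WebRL's actual API:

```python
import random

class StubPolicy:
    def attempt(self, task):
        # Placeholder web-agent rollout; succeeds at random.
        return {"task": task, "success": random.random() > 0.5}
    def rl_update(self, rollouts, rewards):
        pass  # placeholder for the adaptive RL update

class StubORM:
    def score(self, rollout):
        return 1.0 if rollout["success"] else 0.0  # outcome-supervised reward

def evolve(task):
    return task + " (variant)"  # derive a new curriculum task from a failure

policy, orm, tasks = StubPolicy(), StubORM(), ["book a flight", "find a paper"]
for _ in range(3):
    rollouts = [policy.attempt(t) for t in tasks]
    rewards = [orm.score(r) for r in rollouts]
    policy.rl_update(rollouts, rewards)
    # Self-evolving curriculum: unsuccessful attempts seed new tasks.
    tasks += [evolve(r["task"]) for r, s in zip(rollouts, rewards) if s == 0.0]
```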