MASKSEARCH: A Universal Pre-Training Framework to Enhance Agentic Search Capability

alibaba-nlp/masksearch 26 May 2025

In the pre-training stage, we introduce the Retrieval Augmented Mask Prediction (RAMP) task, where the model learns to leverage search tools to fill masked spans across large amounts of pre-training data, thereby giving LLMs universal retrieval and reasoning capabilities.

Multi-hop Question Answering Question Answering

115
0.67 stars / hour

MagCache: Fast Video Generation with Magnitude-Aware Cache

Zehong-Ma/ComfyUI-MagCache 10 Jun 2025

Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features.

SSIM Video Generation

68
0.66 stars / hour

Spiking Graph Convolutional Networks

zulunzhu/spikinggcn 5 May 2022

Graph Convolutional Networks (GCNs) achieve impressive performance owing to their remarkable ability to learn representations of graph information.

Graph Classification Recommendation Systems

103
0.65 stars / hour

SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing

alpha-innovator/surveyforge 6 Mar 2025

Survey papers play a crucial role in scientific research, especially given the rapid growth of research publications.

Survey

223
0.61 stars / hour

SEW: Self-Evolving Agentic Workflows for Automated Code Generation

evoagentx/evoagentx 24 May 2025

Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks.

Code Generation

833
0.61 stars / hour

RWKV-7 "Goose" with Expressive Dynamic State Evolution

fla-org/flash-linear-attention 18 Mar 2025

We present RWKV-7 "Goose", a new sequence modeling architecture with constant memory usage and constant inference time per token.

In-Context Learning Language Modeling

2,688
0.59 stars / hour

Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting

bytedance/dolphin 20 May 2025

Document image parsing is challenging due to its complexly intertwined elements such as text paragraphs, figures, formulas, and tables.

1,548
0.56 stars / hour

DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products

sustcsonglin/flash-linear-attention 14 Feb 2025

Recent architectures such as DeltaNet and RWKV-7 adopted a diagonal plus rank-1 structure, which allows simultaneous token and channel mixing and improves associative recall and, as recently shown, state-tracking when negative eigenvalues are allowed in the state-transition matrices.

Language Modeling Language Modelling

2,689
0.54 stars / hour

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

paper2poster/paper2poster 27 May 2025

To address this challenge, we introduce the first benchmark and metric suite for poster generation, which pairs recent conference papers with author-designed posters and evaluates outputs on (i) Visual Quality: semantic alignment with human posters, (ii) Textual Coherence: language fluency, (iii) Holistic Assessment: six fine-grained aesthetic and informational criteria scored by a VLM-as-judge, and notably (iv) PaperQuiz: the poster's ability to convey core paper content as measured by VLMs answering generated quizzes.

2,019
0.52 stars / hour

QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos

nvlabs/queen 5 Dec 2024

Online free-viewpoint video (FVV) streaming is a challenging and relatively under-explored problem.

Attribute Quantization

89
0.52 stars / hour