Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing

modelscope/nexus-gen 30 Apr 2025

To bridge this gap, we present Nexus-Gen, a unified model that synergizes the language reasoning capabilities of LLMs with the image synthesis power of diffusion models.

Image Generation

180
0.38 stars / hour

TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting

hank0626/timebridge 6 Oct 2024

Eliminating non-stationarity is essential for avoiding spurious regressions and capturing local dependencies in short-term modeling, while preserving it is crucial for revealing long-term cointegration across variates.

Multivariate Time Series Forecasting Time Series

133
0.37 stars / hour

INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning

PrimeIntellect-ai/prime-rl 12 May 2025

We introduce INTELLECT-2, the first globally distributed reinforcement learning (RL) training run of a 32 billion parameter language model.

reinforcement-learning Reinforcement Learning +1

252
0.35 stars / hour

FG-CLIP: Fine-Grained Visual and Textual Alignment

360cvgroup/fg-clip 8 May 2025

Contrastive Language-Image Pre-training (CLIP) excels in multimodal tasks such as image-text retrieval and zero-shot classification but struggles with fine-grained understanding due to its focus on coarse-grained short captions.

Image-text Retrieval object-detection +4

98
0.35 stars / hour

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs

microsoft/bitnet 17 Feb 2025

The advent of 1-bit large language models (LLMs), led by BitNet b1.58, has spurred interest in ternary LLMs.

19,619
0.32 stars / hour

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

flashinfer-ai/flashinfer 2 Jan 2025

We present FlashInfer: a customizable and efficient attention engine for LLM serving.

Scheduling

2,984
0.31 stars / hour

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

index-tts/index-tts 8 Feb 2025

Recently, large language model (LLM) based text-to-speech (TTS) systems have gradually become the mainstream in the industry due to their high naturalness and powerful zero-shot voice cloning capabilities. Here, we introduce the IndexTTS system, which is mainly based on the XTTS and Tortoise models.

Decoder Language Modeling +6

1,776
0.30 stars / hour

InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework

tencent/instantcharacter 16 Apr 2025

Third, to effectively train the framework, we construct a large-scale character dataset containing 10-million-level samples.

Image Generation

948
0.29 stars / hour

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

luo-junyu/awesome-agent-papers 27 Mar 2025

The era of intelligent agents is upon us, driven by revolutionary advancements in large language models.

Language Modeling Language Modelling +1

673
0.29 stars / hour

3D Scene Generation: A Survey

hzxie/awesome-3d-scene-generation 8 May 2025

Recent advances in deep generative models (e.g., GANs, diffusion models) and 3D representations (e.g., NeRF, 3D Gaussians) have enabled the learning of real-world scene distributions, improving fidelity, diversity, and view consistency.

Autonomous Driving Diversity +3

306
0.28 stars / hour