VACE: All-in-One Video Creation and Editing

ali-vilab/vace 10 Mar 2025

Further pursuing the unification of generation and editing tasks has yielded significant progress in the domain of image content creation.

All Video Editing +1

2,661
0.47 stars / hour

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

index-tts/index-tts 8 Feb 2025

Recently, large language model (LLM) based text-to-speech (TTS) systems have gradually become the mainstream in the industry due to their high naturalness and powerful zero-shot voice cloning capabilities. Here, we introduce the IndexTTS system, which is mainly based on the XTTS and Tortoise models.

Decoder Language Modeling +6

2,787
0.45 stars / hour

MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems

masworks/maslab 22 May 2025

To address these challenges, we introduce MASLab, a unified, comprehensive, and research-friendly codebase for LLM-based MAS.

121
0.44 stars / hour

SEW: Self-Evolving Agentic Workflows for Automated Code Generation

evoagentx/evoagentx 24 May 2025

Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks.

Code Generation

906
0.42 stars / hour

Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis

zhouxr6066/Res-SAM 26 Apr 2025

Urban roads and infrastructure, vital to city operations, face growing threats from subsurface anomalies like cracks and cavities.

Anomaly Detection GPR +1

593
0.40 stars / hour

TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling

yandex-research/tabm 31 Oct 2024

Deep learning architectures for supervised learning on tabular data range from simple multilayer perceptrons (MLP) to sophisticated Transformers and retrieval-augmented methods.

Deep Learning Retrieval

397
0.39 stars / hour

SkyReels-V2: Infinite-length Film Generative Model

skyworkai/skyreels-v2 17 Apr 2025

Recent advances in video generation have been driven by diffusion models and autoregressive frameworks, yet critical challenges persist in harmonizing prompt adherence, visual quality, motion dynamics, and duration: compromises in motion dynamics to enhance temporal visual quality, constrained video duration (5-10 seconds) to prioritize resolution, and inadequate shot-aware generation stemming from general-purpose MLLMs' inability to interpret cinematic grammar, such as shot composition, actor expressions, and camera motions.

Large Language Model model +2

3,145
0.39 stars / hour

Advanced long-term earth system forecasting by learning the small-scale nature

easylearningscores/triton_ai4earth 26 May 2025

Reliable long-term forecast of Earth system dynamics is heavily hampered by instabilities in current AI models during extended autoregressive simulations.

253
0.37 stars / hour

AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora

hkust-knowcomp/autoschemakg 29 May 2025

We present AutoSchemaKG, a framework for fully autonomous knowledge graph construction that eliminates the need for predefined schemas.

graph construction Knowledge Graphs

76
0.37 stars / hour

Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20$^{th}$ century Urban Landscapes with Satellite Imageries

tianxiang-hao/urban1960satseg 11 Jun 2025

First, $\textbf{Urban1960SatBench}$ serves as a novel, expertly annotated semantic segmentation dataset built on mid-20$^{th}$ century Keyhole imagery, covering 1,240 km$^2$ and key urban classes (buildings, roads, farmland, water).

Segmentation Self-Supervised Learning +1

53
0.34 stars / hour