PhysX: Physical-Grounded 3D Asset Generation

ziangcao0312/PhysX 16 Jul 2025

3D modeling is moving from virtual to physical.

3D Generation, Image to 3D

111 stars
2.27 stars / hour

WebSailor: Navigating Super-human Reasoning for Web Agent

alibaba-nlp/webagent 3 Jul 2025

Transcending human cognitive limitations represents a critical frontier in LLM training.

4,587 stars
1.57 stars / hour

SpatialTrackerV2: 3D Point Tracking Made Easy

henry123-boy/SpaTrackerV2 17 Jul 2025

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos.

3D Reconstruction, Camera Pose Estimation, +2 more

531 stars
1.22 stars / hour

REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites

agi-inc/agisdk 15 Apr 2025

We introduce REAL, a benchmark and framework for multi-turn agent evaluations on deterministic simulations of real-world websites.

Autonomous Web Navigation, Benchmarking, +2 more

232 stars
1.12 stars / hour

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

index-tts/index-tts 8 Feb 2025

Recently, large language model (LLM) based text-to-speech (TTS) systems have gradually become mainstream in the industry due to their high naturalness and powerful zero-shot voice cloning capabilities. Here, we introduce the IndexTTS system, which is mainly based on the XTTS and Tortoise models.

Decoder, Language Modeling, +6 more

3,810 stars
1.10 stars / hour

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

meigen-ai/multitalk 28 May 2025

Audio-driven human animation methods, such as talking head and talking body generation, have made remarkable progress in generating videos with synchronized facial movements and appealing visual quality.

Human Animation, Instruction Following, +1 more

1,446 stars
0.71 stars / hour

No time to train! Training-Free Reference-Based Instance Segmentation

miquel-espinosa/no-time-to-train 3 Jul 2025

The performance of image segmentation models has historically been constrained by the high cost of collecting large-scale annotated data.

Cross-Domain Few-Shot Object Detection, Image Segmentation, +3 more

103 stars
0.69 stars / hour

Zep: A Temporal Knowledge Graph Architecture for Agent Memory

getzep/graphiti 20 Jan 2025

We introduce Zep, a novel memory layer service for AI agents that outperforms the current state-of-the-art system, MemGPT, in the Deep Memory Retrieval (DMR) benchmark.

RAG, Retrieval, +1 more

14,178 stars
0.64 stars / hour

Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory

introdeeplearning/book 31 Oct 2023

This book aims to provide an introduction to the topic of deep learning algorithms.

Deep Learning

210 stars
0.60 stars / hour

CGVQM+D: Computer Graphics Video Quality Metric and Dataset

intellabs/cgvqm 13 Jun 2025

While existing video and image quality datasets have extensively studied natural videos and traditional distortions, the perception of synthetic content and modern rendering artifacts remains underexplored.

Denoising, Novel View Synthesis

67 stars
0.56 stars / hour