Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

foundationagents/awesome-foundation-agents 31 Mar 2025

The advent of large language models (LLMs) has catalyzed a transformative shift in artificial intelligence, paving the way for advanced intelligent agents capable of sophisticated reasoning, robust perception, and versatile action across diverse domains.

Ranked #1 on Continual Learning on AIDS (using extra training data)

AutoML Continual Learning

774
0.67 stars / hour

NdLinear Is All You Need for Representation Learning

ensemble-core/ndlinear 21 Mar 2025

We propose NdLinear as a drop-in replacement for standard linear layers -- marking an important step toward next-generation neural architectures.

All Representation Learning

185
0.63 stars / hour

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

index-tts/index-tts 8 Feb 2025

Recently, large language model (LLM) based text-to-speech (TTS) systems have gradually become mainstream in industry due to their high naturalness and powerful zero-shot voice cloning capabilities. Here, we introduce the IndexTTS system, which is mainly based on the XTTS and Tortoise models.

Decoder Language Modeling +5

959
0.57 stars / hour

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

harlanhong/actalker 3 Apr 2025

We introduce ACTalker, an end-to-end video diffusion framework that supports both multi-signal control and single-signal control for talking head video generation.

Mamba Talking Head Generation +1

210
0.53 stars / hour

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

hanguo97/flute 26 Nov 2024

Quantizing large language models has become a standard way to reduce their memory and computational costs.

Language Modeling Language Modelling +2

347
0.52 stars / hour

Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion

nevsnev/fgdvi 1 Dec 2024

FloED employs a dual-branch architecture, in which a flow branch first restores the corrupted flow and a multi-scale flow adapter then provides motion guidance to the main inpainting branch.

Denoising Optical Flow Estimation +1

115
0.46 stars / hour

LocAgent: Graph-Guided LLM Agents for Code Localization

gersteinlab/locagent 12 Mar 2025

By parsing codebases into directed heterogeneous graphs, LocAgent creates a lightweight representation that captures code structures (files, classes, functions) and their dependencies (imports, invocations, inheritance), enabling LLM agents to effectively search and locate relevant entities through powerful multi-hop reasoning.

GitHub issue resolution Navigate

258
0.46 stars / hour

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

aigc3d/LHM 13 Mar 2025

Animatable 3D human reconstruction from a single image is a challenging problem due to the ambiguity in decoupling geometry, appearance, and deformation.

3D Human Reconstruction

1,832
0.46 stars / hour

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

petergriffinjin/search-r1 12 Mar 2025

Efficiently acquiring external knowledge and up-to-date information is essential for effective reasoning and text generation in large language models (LLMs).

Question Answering Reinforcement Learning (RL) +2

1,911
0.46 stars / hour

ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy

cccedric/conrft 8 Feb 2025

This work highlights the potential of integrating reinforcement learning to enhance the performance of VLA models for real-world robotic applications.

Q-Learning Safe Exploration

23
0.44 stars / hour