ReasonGraph: Visualisation of Reasoning Paths

ZongqianLi/ReasonGraph 6 Mar 2025

The reasoning processes of Large Language Models (LLMs) are challenging to analyze due to their complexity and the lack of organized visualization tools.

409
0.63 stars / hour

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

potamides/detikzify 14 Mar 2025

Meanwhile, large amounts of unaligned graphics programs and captioned raster images are more readily available.

Program Synthesis

762
0.62 stars / hour

FinRobot: AI Agent for Equity Research and Valuation with Large Language Models

ai4finance-foundation/finrobot 13 Nov 2024

The system is structured around three specialized agents: the Data-CoT Agent, which aggregates diverse data sources for robust financial integration; the Concept-CoT Agent, which mimics an analyst's reasoning to generate actionable insights; and the Thesis-CoT Agent, which synthesizes these insights into a coherent investment thesis and report.

AI Agent

2,793
0.61 stars / hour

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

nvidia-cosmos/cosmos-transfer1 18 Mar 2025

We introduce Cosmos-Transfer, a conditional world generation model that can generate world simulations based on multiple spatial control inputs of various modalities such as segmentation, depth, and edge.

253
0.59 stars / hour

Falcon: A Remote Sensing Vision-Language Foundation Model

tianhuilab/falcon 14 Mar 2025

This paper introduces a holistic vision-language foundation model tailored for remote sensing, named Falcon.

Image Captioning, Image Classification, +2

109
0.59 stars / hour

Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception

dk-liang/unifuture 17 Mar 2025

Extensive experiments on the nuScenes dataset demonstrate that UniFuture outperforms specialized models on future generation and perception tasks, highlighting the advantages of a unified, structurally-aware world model.

Future prediction, Scene Generation

86
0.58 stars / hour

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

yaotingwangofficial/awesome-mcot 16 Mar 2025

By extending the advantage of chain-of-thought (CoT) reasoning in human-like step-by-step processes to multimodal contexts, multimodal CoT (MCoT) reasoning has recently garnered significant research attention, especially in the integration with multimodal large language models (MLLMs).

Autonomous Driving, multimodal generation, +1

222
0.58 stars / hour

HybridFlow: A Flexible and Efficient RLHF Framework

volcengine/verl 28 Sep 2024

Traditional RL can be modeled as a dataflow, where each node represents computation of a neural network (NN) and each edge denotes data dependencies between the NNs.

Large Language Model

5,633
0.58 stars / hour

Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering

nvlabs/svraster 5 Dec 2024

We propose an efficient radiance field rendering algorithm that incorporates a rasterization process on adaptive sparse voxels without neural networks or 3D Gaussians.

Novel View Synthesis

546
0.56 stars / hour

Zep: A Temporal Knowledge Graph Architecture for Agent Memory

getzep/graphiti 20 Jan 2025

We introduce Zep, a novel memory layer service for AI agents that outperforms the current state-of-the-art system, MemGPT, in the Deep Memory Retrieval (DMR) benchmark.

RAG, Retrieval

2,707
0.56 stars / hour