Human-like Episodic Memory for Infinite Context LLMs

em-llm/EM-LLM-model 12 Jul 2024

Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences.

Computational Efficiency, Event Segmentation, +2

192
0.99 stars / hour

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

zhaochen0110/openthinkimg 13 May 2025

We hope OpenThinkIMG can serve as a foundational framework for advancing dynamic, tool-augmented visual reasoning, helping the community develop AI agents that can genuinely "think with images".

Reinforcement Learning (RL), Visual Reasoning

128
0.97 stars / hour

Fully Open Source Moxin-7B Technical Report

moxin-org/moxin-llm 8 Dec 2024

Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities.

273
0.88 stars / hour

Flow-GRPO: Training Flow Matching Models via Online RL

yifan123/flow_grpo 8 May 2025

We propose Flow-GRPO, the first method integrating online reinforcement learning (RL) into flow matching models.

Denoising, Diversity, +3

553
0.86 stars / hour

UniVLA: Learning to Act Anywhere with Task-centric Latent Actions

opendrivelab/univla 9 May 2025

Learned from internet-scale videos, the generalist policy can be deployed to various robots through efficient latent action decoding.

Vision-Language-Action

254
0.85 stars / hour

Generating Physically Stable and Buildable LEGO Designs from Text

AvaLovelace1/LegoGPT 8 May 2025

Our experiments show that LegoGPT produces stable, diverse, and aesthetically pleasing LEGO designs that align closely with the input text prompts.

3D Generation, Large Language Model, +1

1,067
0.78 stars / hour

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

going-doer/paper2code 24 Apr 2025

Despite the rapid growth of machine learning research, corresponding code implementations are often unavailable, making it slow and labor-intensive for researchers to reproduce results and build upon prior work.

Code Generation

2,095
0.72 stars / hour

Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models

emcie-co/parlant 5 Mar 2025

We present Attentive Reasoning Queries (ARQs), a novel structured reasoning approach that significantly improves instruction-following in Large Language Models through domain-specialized reasoning blueprints.

Hallucination, Instruction Following, +1

2,901
0.68 stars / hour

Aligning Anime Video Generation with Human Feedback

bilibili/index-anisora 14 Apr 2025

Existing reward models, designed primarily for real-world videos, fail to capture the unique appearance and consistency requirements of anime.

Video Generation

255
0.67 stars / hour

LTX-Video: Realtime Video Latent Diffusion

Lightricks/LTX-Video 30 Dec 2024

To address this, our VAE decoder is tasked with both latent-to-pixel conversion and the final denoising step, producing the clean result directly in pixel space.

Denoising, Image to Video Generation

5,777
0.63 stars / hour