BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

jiuhaichen/blip3o 14 May 2025

Building on our innovative model design, training recipe, and datasets, we develop BLIP3-o, a suite of state-of-the-art unified multimodal models.

Image Generation

750
3.45 stars / hour

Aligning Anime Video Generation with Human Feedback

bilibili/index-anisora 14 Apr 2025

Existing reward models, designed primarily for real-world videos, fail to capture the unique appearance and consistency requirements of anime.

Video Generation

564
2.14 stars / hour

AlphaEvolve: A Learning Framework to Discover Novel Alphas in Quantitative Investment

codelion/openevolve 30 Mar 2021

In this paper, we introduce a new class of alphas to model scalar, vector, and matrix features which possess the strengths of these two existing classes.

AutoML, Stock Prediction

404
1.89 stars / hour

Parallel Scaling Law for Language Models

qwenlm/parscale 15 May 2025

We apply $P$ diverse and learnable transformations to the input, execute forward passes of the model in parallel, and dynamically aggregate the $P$ outputs.

271
1.77 stars / hour
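
The ParScale summary above (transform the input P ways, run the forward passes in parallel, aggregate the P outputs dynamically) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the qwenlm/parscale code: the "model" is a single dot product, the transformations are additive offsets, and the aggregation is a softmax over scores computed from the outputs. The names `toy_model`, `parallel_forward`, `offsets`, and `gate` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(x, weights):
    """Stand-in for the shared model: a single dot product (assumption)."""
    return sum(xi * wi for xi, wi in zip(x, weights))


def parallel_forward(x, weights, offsets, gate):
    """Run P transformed copies of x through one shared model and aggregate.

    offsets: P hypothetical learnable input transformations (additive shifts).
    gate:    hypothetical learnable (scale, bias) pair that scores each output.
    """
    # 1. Apply P different transformations to the same input.
    streams = [[xi + oi for xi, oi in zip(x, off)] for off in offsets]
    # 2. Pass every stream through the same model (a real system would batch these).
    outputs = [toy_model(s, weights) for s in streams]
    # 3. Aggregate dynamically: the mixing weights depend on the outputs themselves.
    scale, bias = gate
    mix = softmax([scale * o + bias for o in outputs])
    aggregated = sum(m * o for m, o in zip(mix, outputs))
    return aggregated, mix


if __name__ == "__main__":
    x = [1.0, 2.0]
    weights = [0.5, -0.25]
    offsets = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # P = 3
    aggregated, mix = parallel_forward(x, weights, offsets, gate=(1.0, 0.0))
    print(aggregated, mix)
```

In this sketch the model parameters are shared across all P streams; only the offsets and the gate differ per stream or are added on top, which is the sense in which the computation, rather than the parameter count, is scaled.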

Fully Open Source Moxin-7B Technical Report

moxin-org/moxin-llm 8 Dec 2024

Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities.

356
1.45 stars / hour

FastVLM: Efficient Vision Encoding for Vision Language Models

apple/ml-fastvlm 17 Dec 2024

At different operational resolutions, the vision encoder of a VLM can be optimized along two axes: reducing encoding latency and minimizing the number of visual tokens passed to the LLM, thereby lowering overall latency.

3,668
1.36 stars / hour

Thinkless: LLM Learns When to Think

vainf/thinkless 19 May 2025

Reasoning Language Models, capable of extended chain-of-thought reasoning, have demonstrated remarkable performance on tasks requiring complex logical inference.

58
1.33 stars / hour

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

zhaochen0110/openthinkimg 13 May 2025

We hope OpenThinkIMG can serve as a foundational framework for advancing dynamic, tool-augmented visual reasoning, helping the community develop AI agents that can genuinely "think with images".

Reinforcement Learning (RL), Visual Reasoning

166
0.99 stars / hour

MASS: Multi-Agent Simulation Scaling for Portfolio Construction

gta0804/mass 15 May 2025

LLM-based multi-agent systems have gained significant attention for their potential in simulation and in enhancing performance.

106
0.96 stars / hour

WorldPM: Scaling Human Preference Modeling

qwenlm/worldpm 15 May 2025

Motivated by scaling laws in language modeling that demonstrate how test loss scales as a power law with model and dataset sizes, we find that similar laws exist in preference modeling.

Language Modeling

70
0.77 stars / hour