Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos

dwawayu/pensieve 19 May 2025

In the first stage, we learn to reconstruct the scene implicitly in a latent space without relying on any explicit 3D representation.

3D geometry, Camera Pose Estimation, +2

17
0.38 stars / hour

VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation

huawei-lin/VTBench 19 May 2025

To address this gap, we introduce VTBench, a comprehensive benchmark that systematically evaluates VTs across three core tasks: Image Reconstruction, Detail Preservation, and Text Preservation, and covers a diverse range of evaluation scenarios.

Image Generation, Image Reconstruction

17
0.38 stars / hour

Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis

zhouxr6066/Res-SAM 26 Apr 2025

Urban roads and infrastructure, vital to city operations, face growing threats from subsurface anomalies like cracks and cavities.

Anomaly Detection, GPR, +1

38
0.38 stars / hour

Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation

Pints-AI/Finetune-Bench-RAG 16 May 2025

Retrieval-Augmented Generation (RAG) has emerged as a powerful framework to improve factuality in large language models (LLMs) by grounding their outputs in retrieved documents.

Hallucination, RAG, +2

24
0.37 stars / hour

qlib

microsoft/qlib 3 Oct 2020

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions.

BIG-bench Machine Learning, feature selection

19,623
0.36 stars / hour

Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation

steven-ccq/vilamp 3 Apr 2025

Based on this principle, we develop ViLAMP, a hierarchical video-language model that processes hour-long videos at "mixed precision" through two key mechanisms: (1) differential keyframe selection that maximizes query relevance while maintaining temporal distinctiveness at the frame level and (2) differential feature merging that preserves query-salient features in non-keyframes at the patch level.

Computational Efficiency, Language Modeling, +3

135
0.35 stars / hour

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

nvidia-cosmos/cosmos-reason1 18 Mar 2025

We begin by defining key capabilities for Physical AI reasoning, with a focus on physical common sense and embodied reasoning.

3D Face Animation, Common Sense Reasoning, +1

367
0.35 stars / hour

AlphaNet: Scaling Up Local-frame-based Atomistic Interatomic Potential

zmyybc/alphanet 13 Jan 2025

Molecular dynamics simulations demand an unprecedented combination of accuracy and scalability to tackle grand challenges in catalysis and materials design.

Computational Efficiency

103
0.34 stars / hour

Do Large Language Models Need a Content Delivery Network?

lmcache/lmcache 16 Sep 2024

As the use of large language models (LLMs) expands rapidly, so does the range of knowledge needed to supplement various LLM queries.

In-Context Learning

1,123
0.34 stars / hour

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

facico/goat-peft 24 Feb 2025

While Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning for Large Language Models (LLMs), its performance often falls short of Full Fine-Tuning (Full FT).

Image Classification, Mixture-of-Experts, +3

12
0.33 stars / hour