Scalable Pre-training of Large Autoregressive Image Models

lightly-ai/lightly 16 Jan 2024

Specifically, we highlight two key findings: (1) the performance of the visual features scales with both the model capacity and the quantity of data, (2) the value of the objective function correlates with the performance of the model on downstream tasks.

Image Classification

3,066
0.39 stars / hour

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

3dtopia/3dtopia-xl 19 Sep 2024

The increasing demand for high-quality 3D assets across various industries necessitates efficient and automated 3D content creation.

633
0.38 stars / hour

Segment Anything without Supervision

frank-xwang/unsam 28 Jun 2024

By integrating our unsupervised pseudo masks into SA-1B's ground-truth masks and training UnSAM with only 1% of SA-1B, a lightly semi-supervised UnSAM can often segment entities overlooked by supervised SAM, exceeding SAM's AR by over 6.7% and AP by 3.9% on SA-1B.

Clustering Image Segmentation +2

355
0.38 stars / hour

GeoCalib: Learning Single-image Calibration with Geometric Optimization

cvg/geocalib 10 Sep 2024

This single-image calibration can benefit various downstream applications like image editing and 3D mapping.

3D geometry Visual Localization

359
0.38 stars / hour

Reinforcement Learning Meets Visual Odometry

uzh-rpg/rl_vo 22 Jul 2024

Despite recent advances, existing VO methods still rely on heuristic design choices that require several weeks of hyperparameter tuning by human experts, hindering generalizability and robustness.

Decision Making reinforcement-learning +3

151
0.36 stars / hour

A Survey on the Honesty of Large Language Models

sihengli99/llm-honesty-survey 27 Sep 2024

Honesty is a fundamental principle for aligning large language models (LLMs) with human values, requiring these models to recognize what they know and don't know and be able to faithfully express their knowledge.

28
0.35 stars / hour

Fast Inference from Transformers via Speculative Decoding

ericlbuehler/mistral.rs 30 Nov 2022

Inference from large autoregressive models like Transformers is slow - decoding K tokens takes K serial runs of the model.

Language Modelling

3,590
0.33 stars / hour

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

thudm/longwriter 13 Aug 2024

By incorporating this dataset into model training, we successfully scale the output length of existing models to over 10,000 words while maintaining output quality.

1,151
0.33 stars / hour

Grounding Image Matching in 3D with MASt3R

naver/mast3r 14 Jun 2024

Image Matching is a core component of all best-performing algorithms and pipelines in 3D vision.

3D Reconstruction

922
0.33 stars / hour

LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models

cshaitao/lexeval 30 Sep 2024

Applying existing LLMs to legal systems without careful evaluation of their potential and limitations could pose significant risks in legal practice.

Fairness

28
0.33 stars / hour