143
4.94 stars / hour

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

apple/ml-depth-pro 2 Oct 2024

We present a foundation model for zero-shot metric monocular depth estimation.

Monocular Depth Estimation

2,594
3.06 stars / hour

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

openai/mle-bench 9 Oct 2024

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering.

85
3.00 stars / hour

Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

ohayonguy/PMRF 1 Oct 2024

Photo-realistic image restoration algorithms are typically evaluated by distortion measures (e.g., PSNR, SSIM) and by perceptual quality measures (e.g., FID, NIQE), where the desire is to attain the lowest possible distortion without compromising on perceptual quality.

Ranked #1 on Blind Face Restoration on CelebA-Test (FID metric)

Blind Face Restoration, Image Colorization, +5 more

303
1.99 stars / hour

Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate

shikiw/modality-integration-rate 9 Oct 2024

We present the Modality Integration Rate (MIR), an effective, robust, and generalized metric to indicate the multi-modal pre-training quality of Large Vision Language Models (LVLMs).

cross-modal alignment

50
1.42 stars / hour

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

thudm/longwriter 13 Aug 2024

By incorporating this dataset into model training, we successfully scale the output length of existing models to over 10,000 words while maintaining output quality.

1,395
1.30 stars / hour

Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers

liruiw/HPT 30 Sep 2024

Previous robot learning methods often collect data to train with one specific embodiment for one task, which is expensive and prone to overfitting.

236
1.12 stars / hour

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

verazuo/jailbreak_llms 7 Aug 2023

We hope that our study can facilitate the research community and LLM vendors in promoting safer and regulated LLMs.

Community Detection

2,558
1.08 stars / hour

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

ictnlp/llama-omni 10 Sep 2024

We build our model based on the latest Llama-3.1-8B-Instruct model.

2,315
0.76 stars / hour

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

microsoft/vptq 25 Sep 2024

Due to the redundancy in LLM weights, recent research has focused on pushing weight-only quantization to extremely low-bit (even down to 2 bits).

Quantization

369
0.74 stars / hour