Trending Research

EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning

hustvl/eva-x • 8 May 2024

The diagnosis and treatment of chest diseases play a crucial role in maintaining human health.

Few-Shot Learning, Self-Supervised Learning

0.38 stars / hour

Paper
Code

OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

ldkong1205/openess • 8 May 2024

Event-based semantic segmentation (ESS) is a fundamental yet challenging task for event camera sensing.

Domain Adaptation, Scene Understanding, +1

0.38 stars / hour

Paper
Code

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

deepseek-ai/deepseek-llm • 5 Jan 2024

The rapid development of open-source large language models (LLMs) has been truly remarkable.

1,202

0.37 stars / hour

Paper
Code

X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design

ericlbuehler/mistral.rs • 11 Feb 2024

Starting with a set of pre-trained LoRA adapters, our gating strategy uses the hidden states to dynamically mix adapted layers, allowing the resulting X-LoRA model to draw upon different capabilities and create never-before-used deep layer-wise combinations to solve tasks.
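The gating idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the single-layer setup, the softmax normalization of the gate outputs, and every name below (`xlora_like_layer`, `gate`, `adapters`) are assumptions made for illustration only.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    mx = max(z)
    e = [math.exp(x - mx) for x in z]
    total = sum(e)
    return [x / total for x in e]


def xlora_like_layer(x, hidden, w0, adapters, gate):
    """Toy mixture of LoRA adapters for one linear layer (hypothetical).

    x        : input vector to the layer
    hidden   : hidden state fed to the gate
    w0       : frozen base weight matrix
    adapters : list of (A, B) low-rank pairs, each contributing B @ (A @ x)
    gate     : matrix mapping `hidden` to one logit per adapter
    """
    # Assumption: gate logits are normalized with a softmax.
    scales = softmax(matvec(gate, hidden))
    out = matvec(w0, x)
    for s, (a, b) in zip(scales, adapters):
        delta = matvec(b, matvec(a, x))  # low-rank update B @ (A @ x)
        out = [o + s * d for o, d in zip(out, delta)]
    return out, scales
```

Because the scales are computed from the hidden state, different inputs can weight the same set of frozen adapters differently, which is the dynamic mixing the abstract refers to.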

graph construction, Knowledge Graphs, +3

1,473

0.37 stars / hour

Paper
Code

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

FoundationVision/VAR • 3 Apr 2024

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction".

Ranked #7 on Image Generation on ImageNet 256x256

Image Generation, Language Modelling, +2

3,445

0.36 stars / hour

Paper
Code

Model Stock: All we need is just a few fine-tuned models

arcee-ai/mergekit • 28 Mar 2024

This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance.

3,569

0.36 stars / hour

Paper
Code

DeepSeek-VL: Towards Real-World Vision-Language Understanding

deepseek-ai/deepseek-vl • 8 Mar 2024

The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

Ranked #30 on Visual Question Answering on MM-Vet

Chatbot, Language Modelling, +3

1,607

0.36 stars / hour

Paper
Code

UFO: A UI-Focused Agent for Windows OS Interaction

microsoft/UFO • 8 Feb 2024

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision.

Navigate

4,529

0.35 stars / hour

Paper
Code

DoRA: Weight-Decomposed Low-Rank Adaptation

NVlabs/DoRA • 14 Feb 2024

By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead.

144

0.34 stars / hour

Paper
Code

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Leeroo-AI/mergoo • 12 Mar 2024

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.

Ranked #30 on Question Answering on TriviaQA

Arithmetic Reasoning, Code Generation, +6

296

0.33 stars / hour

Paper
Code