LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

openbmb/minicpm-v 18 Mar 2024

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

3,163
6.19 stars / hour

A decoder-only foundation model for time-series forecasting

google-research/timesfm 14 Oct 2023

Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset.

Decoder Time Series +1

2,478
1.61 stars / hour

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

idea-research/grounding-dino-1.5-api 16 May 2024

Empirical results demonstrate the effectiveness of Grounding DINO 1.5, with the Grounding DINO 1.5 Pro model attaining a 54.3 AP on the COCO detection benchmark and a 55.7 AP on the LVIS-minival zero-shot transfer benchmark, setting new records for open-set object detection.

Ranked #1 on Zero-Shot Object Detection on LVIS v1.0 val (using extra training data)

Edge-computing object-detection +1

357
1.55 stars / hour

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

tencent/hunyuandit 14 May 2024

For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images.

Image Generation Language Modelling +2

1,823
1.44 stars / hour

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

opengvlab/internvl 25 Apr 2024

Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.

4k Language Modelling +3

2,722
1.20 stars / hour

LightAutoML: AutoML Solution for a Large Financial Services Ecosystem

sb-ai-lab/lightautoml 3 Sep 2021

We present LightAutoML, an AutoML system developed for a large European financial services company and its ecosystem, which satisfies the set of idiosyncratic requirements that this ecosystem has for AutoML solutions.

AutoML

1,020
1.13 stars / hour

How Far Are We From AGI

ulab-uiuc/agi-survey 16 May 2024

The evolution of artificial intelligence (AI) has profoundly impacted human society, driving significant advancements in multiple sectors.

215
0.94 stars / hour

KAN: Kolmogorov-Arnold Networks

Blealtan/efficient-kan 30 Apr 2024

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).

2,768
0.93 stars / hour

Layer-Condensed KV Cache for Efficient Inference of Large Language Models

whyNLP/LCKV 17 May 2024

In this paper, we propose a novel method that only computes and caches the KVs of a small number of layers, thus significantly saving memory consumption and improving inference throughput.

Language Modelling

76
0.79 stars / hour

Efficient Multimodal Large Language Models: A Survey

lijiannuist/efficient-multimodal-llms-survey 17 May 2024

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.

Edge-computing Question Answering +1

55
0.79 stars / hour