However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives make it challenging for researchers and the community to easily experiment with, understand, and assess the latest advancements.
Although the Mixture of Experts (MoE) architecture has been employed to efficiently scale large language and image-text models, these efforts typically involve fewer experts and limited modalities.
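To make the MoE scaling idea concrete, below is a minimal sketch of a sparse Mixture-of-Experts layer with top-k token routing. It is a generic illustration rather than the architecture of any specific model; the expert count, hidden sizes, and top_k value are illustrative assumptions.

```python
# Minimal sparse Mixture-of-Experts layer with top-k routing (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # routing scores per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.gate(x)                            # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

# Usage: route a batch of 16 token embeddings through the layer.
x = torch.randn(16, 512)
y = SparseMoE()(x)
```

Only the selected experts run for each token, which is what lets the parameter count grow while the per-token compute stays roughly constant.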
Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).
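As a rough illustration of how a KAN layer differs from an MLP layer, the sketch below places a learnable univariate function on every edge and sums them into each output unit. For brevity the edge functions are parameterized with Gaussian radial basis functions rather than the B-splines used in the original formulation, so the basis choice, grid range, and sizes are illustrative assumptions.

```python
# Simplified Kolmogorov-Arnold Network (KAN) layer: learnable univariate
# functions on edges, parameterized here with Gaussian RBFs (illustrative only).
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(*grid_range, num_basis))
        self.width = (grid_range[1] - grid_range[0]) / num_basis
        # One coefficient per (output, input, basis): the learnable edge functions.
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):                                 # x: (batch, in_dim)
        # Evaluate each input coordinate on every basis function.
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # Sum the per-edge univariate functions into each output unit.
        return torch.einsum("bin,oin->bo", phi, self.coef)

# A two-layer KAN used as a drop-in alternative to a small MLP.
model = nn.Sequential(KANLayer(4, 16), KANLayer(16, 1))
y = model(torch.randn(32, 4))
```

The contrast with an MLP is that the nonlinearity lives on the edges and is learned per connection, instead of a fixed activation applied after a linear map.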
End-to-end transformer-based detectors (DETRs) have shown exceptional performance in both closed-set and open-vocabulary object detection (OVD) tasks through the integration of language modalities.
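A common way language is integrated into such open-vocabulary detectors is to classify object queries by similarity to text embeddings of category names instead of a fixed classifier head; the sketch below shows that pattern in isolation. The text-embedding source, dimensions, and temperature are illustrative assumptions, not a specific model's design.

```python
# Classifying detector queries against an open vocabulary via text embeddings
# (generic sketch of the language-integration pattern, not a specific detector).
import torch
import torch.nn.functional as F

def classify_queries(query_feats, text_embeds, temperature=0.07):
    """query_feats: (num_queries, d) decoder outputs projected into the text space.
    text_embeds: (num_classes, d) embeddings of category names (e.g., from a text encoder)."""
    q = F.normalize(query_feats, dim=-1)
    t = F.normalize(text_embeds, dim=-1)
    logits = q @ t.T / temperature        # cosine similarity used as class logits
    return logits.softmax(dim=-1)

# Usage: score 100 object queries against an open vocabulary of 3 class names.
probs = classify_queries(torch.randn(100, 256), torch.randn(3, 256))
```

Because the class set is defined only by the text embeddings passed in, new categories can be added at inference time without retraining the detector head.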
However, unlike pretraining or fine-tuning a single model, scaling reinforcement learning from human feedback (RLHF) for training large language models poses coordination challenges across four models.
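To show why a single RLHF step already spans several models, the sketch below assumes a typical PPO-style setup with a trainable actor and critic plus a frozen reward model and reference policy; the per-token KL-penalized reward, shapes, and beta value are illustrative assumptions rather than a specific system's implementation.

```python
# Sketch of the four-model coordination in PPO-style RLHF: the actor and frozen
# reference supply log-probs, the reward model scores the response, and the
# critic (not shown) turns these rewards into advantages for the PPO update.
import torch

def rlhf_step_rewards(actor_logprobs, ref_logprobs, reward_score, beta=0.1):
    """actor_logprobs, ref_logprobs: (seq_len,) log-probs of the sampled response
    under the actor and the frozen reference model; reward_score: scalar score
    from the reward model for the full response."""
    kl = actor_logprobs - ref_logprobs           # per-token KL estimate
    rewards = -beta * kl                         # KL penalty keeps the actor near the reference
    rewards[-1] = rewards[-1] + reward_score     # reward model score assigned at the final token
    return rewards

# Usage: combine with the critic's value estimates (e.g., via GAE) to get
# advantages for the PPO update of the actor.
rewards = rlhf_step_rewards(torch.randn(10), torch.randn(10), torch.tensor(1.5))
```

Keeping these four models in sync, each with its own memory footprint and placement, is the coordination burden that distinguishes RLHF from training a single model.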
For vision tasks, image classification aligns with neither characteristic, so we hypothesize that Mamba is not necessary for it. Detection and segmentation are likewise not autoregressive, but they do exhibit the long-sequence characteristic, so we believe it is still worthwhile to explore Mamba's potential for these tasks.
The evolution of artificial intelligence (AI) has profoundly impacted human society, driving significant advancements in multiple sectors.
Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.
In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.
This regularization improves performance metrics on benchmarks like Tiny Stories and SuperGLUE while also successfully decreasing the linearity of the models.