TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Infini-AI-Lab/TriForce 18 Apr 2024

However, the key-value (KV) cache, which is stored to avoid re-computation, has emerged as a critical bottleneck because it grows linearly in size with the sequence length.

92 stars · 0.50 stars / hour
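A rough, back-of-envelope illustration of why the KV cache becomes the bottleneck: its size grows linearly with sequence length. The model dimensions below are placeholders for the sake of the calculation, not values taken from the paper.

    # Rough KV-cache size estimate: 2 (K and V) * layers * kv_heads * head_dim
    # * seq_len * bytes_per_element. All model dimensions are illustrative.
    def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128, dtype_bytes=2):
        return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

    for seq_len in (4_096, 32_768, 131_072):
        gib = kv_cache_bytes(seq_len) / 2**30
        print(f"{seq_len:>7} tokens -> {gib:6.1f} GiB of KV cache")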

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

opengvlab/internvl 21 Dec 2023

However, the progress in vision and vision-language foundation models, which are also critical elements of multi-modal AGI, has not kept pace with LLMs.

Ranked #1 on Zero-Shot Video Retrieval on MSR-VTT-full (using extra training data)

Image Retrieval · Image-to-Text Retrieval · +10

844 stars · 0.50 stars / hour

Rethinking Inductive Biases for Surface Normal Estimation

baegwangbin/dsine 1 Mar 2024

Despite the growing demand for accurate surface normal estimation models, existing methods use general-purpose dense prediction models, adopting the same inductive biases as other tasks.

Surface Normal Estimation

476 stars · 0.48 stars / hour

Efficient Multimodal Learning from Data-centric Perspective

baai-dcai/bunny 18 Feb 2024

Multimodal Large Language Models (MLLMs) have demonstrated notable capabilities in general visual understanding and reasoning tasks.

550 stars · 0.47 stars / hour

LongEmbed: Extending Embedding Models for Long Context Retrieval

dwzhu-pku/longembed 18 Apr 2024

This paper explores context window extension of existing embedding models, pushing the limit to 32k without requiring additional training.

4k · 8k · +3

63 stars · 0.47 stars / hour
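One common family of training-free context-extension tricks in this setting is rescaling position information, for example linear position interpolation for RoPE-style positions. The sketch below illustrates that general idea only; it is not the paper's specific recipe, and the dimensions and context lengths are assumptions.

    import numpy as np

    # Minimal sketch of linear position interpolation, one training-free way to
    # stretch a model's positional range (the paper evaluates several strategies;
    # this is illustrative, not its exact method).
    def rope_angles(positions, dim=64, base=10000.0):
        inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
        return np.outer(positions, inv_freq)  # shape: (len(positions), dim // 2)

    train_ctx, target_ctx = 512, 32_768
    positions = np.arange(target_ctx)
    # Rescale positions so the longest input maps back into the trained range.
    interpolated = positions * (train_ctx / target_ctx)
    angles = rope_angles(interpolated)
    print(angles.shape, angles.max())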

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

xuezhemax/megalodon 12 Apr 2024

The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.

325 stars · 0.46 stars / hour
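The complexity claim can be made concrete with a rough cost comparison between full attention and a sub-quadratic alternative; the numbers below are illustrative placeholders, not measurements from the paper.

    # Illustrative scaling comparison: full self-attention scores cost O(L^2 * d),
    # while linear-attention / state-space-style mechanisms cost roughly O(L * d^2).
    d = 4096
    for L in (8_192, 65_536, 524_288):
        quadratic = L * L * d   # pairwise attention scores
        linear = L * d * d      # e.g. linear attention or a state-space scan
        print(f"L={L:>7}: quadratic/linear cost ratio ~ {quadratic / linear:,.0f}x")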

BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models

ledzy/badam 3 Apr 2024

This work presents BAdam, an optimizer that leverages the block coordinate optimization framework with Adam as the inner solver.

105 stars · 0.45 stars / hour
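A minimal sketch of the block coordinate idea the abstract describes: only one block of parameters is trainable at a time, with Adam as the inner solver on that block. The block partition and switching schedule here are illustrative assumptions, not the paper's exact settings.

    import torch

    # Block coordinate training sketch: freeze everything, unfreeze one block,
    # optimize that block with a fresh Adam instance, then move to the next block.
    def train_blockwise(model, data_loader, loss_fn, steps_per_block=50, lr=1e-3):
        blocks = [list(layer.parameters()) for layer in model.children()]
        for block in blocks:
            if not block:                      # skip parameter-free modules
                continue
            for p in model.parameters():
                p.requires_grad_(False)
            for p in block:
                p.requires_grad_(True)
            optimizer = torch.optim.Adam(block, lr=lr)  # inner solver for this block
            for _, (x, y) in zip(range(steps_per_block), data_loader):
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                optimizer.step()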

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Beomi/InfiniTransformer 10 Apr 2024

This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation.

Book summarization · Language Modelling · +1

193 stars · 0.43 stars / hour
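A loose sketch of the bounded-memory idea: a fixed-size compressive memory is updated segment by segment and mixed with ordinary local attention, so cost and state stay constant regardless of input length. The feature map, normalization, and fixed 0.5 mix below are simplifications; the paper's exact update rule and learned gating differ.

    import torch
    import torch.nn.functional as F

    # Segment-wise processing with a fixed-size compressive memory (sketch).
    # Each segment tensor has shape (seg_len, d_k); d_v is assumed equal to d_k.
    def segment_stream(q_segments, k_segments, v_segments, d_k):
        memory = torch.zeros(d_k, d_k)   # fixed-size state, independent of length
        norm = torch.zeros(d_k)
        outputs = []
        for q, k, v in zip(q_segments, k_segments, v_segments):
            phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1       # positive feature map
            mem_out = (phi_q @ memory) / (phi_q @ norm).clamp(min=1e-6).unsqueeze(-1)
            local = torch.softmax(q @ k.transpose(-2, -1) / d_k**0.5, dim=-1) @ v
            outputs.append(0.5 * local + 0.5 * mem_out)     # fixed mix, not a learned gate
            memory = memory + phi_k.transpose(-2, -1) @ v   # compress this segment
            norm = norm + phi_k.sum(dim=-2)
        return torch.cat(outputs, dim=-2)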

State Space Model for New-Generation Network Alternative to Transformers: A Survey

event-ahu/mamba_state_space_model_paper_list 15 Apr 2024

In this paper, we give the first comprehensive review of these works and also provide experimental comparisons and analysis to better demonstrate the features and advantages of SSM.

327 stars · 0.42 stars / hour
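For readers new to the topic, the basic discrete state-space recurrence these models build on is a linear scan over the sequence. The matrices below are random placeholders, included only to show the shapes and the per-token update.

    import numpy as np

    # Discrete linear state-space recurrence:
    #   h_t = A h_{t-1} + B x_t,   y_t = C h_t   (some variants add a D x_t term)
    def ssm_scan(x, A, B, C):
        h = np.zeros(A.shape[0])
        ys = []
        for x_t in x:              # one step per token: linear in sequence length
            h = A @ h + B @ x_t
            ys.append(C @ h)
        return np.stack(ys)

    rng = np.random.default_rng(0)
    d_state, d_in, d_out, T = 16, 8, 8, 128
    y = ssm_scan(rng.normal(size=(T, d_in)),
                 0.9 * np.eye(d_state),
                 rng.normal(size=(d_state, d_in)) * 0.1,
                 rng.normal(size=(d_out, d_state)) * 0.1)
    print(y.shape)  # (128, 8)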

LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models

yangjianxin1/firefly 8 Nov 2023

We present LongQLoRA, an efficient and effective method to extend the context length of large language models with fewer training resources.

8k

4,547 stars · 0.41 stars / hour