Recent Advances of Multimodal Continual Learning: A Comprehensive Survey

lucydyu/awesome-multimodal-continual-learning 7 Oct 2024

Continual learning (CL) aims to empower machine learning models to learn continually from new data, while building upon previously acquired knowledge without forgetting.

Continual Learning Survey

17
0.28 stars / hour

Better Call SAL: Towards Learning to Segment Anything in Lidar

nv-dvl/segment-anything-lidar 19 Mar 2024

We propose the SAL (Segment Anything in Lidar) method consisting of a text-promptable zero-shot model for segmenting and classifying any object in Lidar, and a pseudo-labeling engine that facilitates model training without manual supervision.

Panoptic Segmentation, Segmentation

58
0.27 stars / hour

MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents

ghost-in-moss/ghostos 24 Sep 2024

MOSS ensures consistency and adaptability by using a mechanism that maintains the Python context across interactions, including isolation of local variables and preservation of runtime integrity.

Code Generation Management

38
0.27 stars / hour

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

VikParuchuri/surya NeurIPS 2021

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

C++ code Crack Segmentation +2

11,125
0.27 stars / hour

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

tothebeginning/pulid 24 Apr 2024

We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.

Text-to-Image Generation

2,275
0.26 stars / hour

PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs

chenmnz/prefixquant 7 Oct 2024

Specifically, PrefixQuant identifies high-frequency outlier tokens and prefixes them in the KV cache, preventing the generation of outlier tokens during inference and simplifying quantization.

Common Sense Reasoning Quantization

50
0.26 stars / hour

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

chenllliang/dnd-transformer 2 Oct 2024

This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer.

Image Generation Quantization

42
0.29 stars / hour

SyllableLM: Learning Coarse Semantic Units for Speech Language Models

alanbaade/SyllableLM 5 Oct 2024

For speech in particular, the high resolution of waveforms (16,000 samples/second or more) presents a significant challenge, as speech-based language models have had to use several times more tokens per word than text-based language models.

Clustering Language Modelling +1

30
0.26 stars / hour

optillm

codelion/optillm 12 Sep 2023

Optimizing inference proxy for LLMs

Decoder

1,073
0.26 stars / hour

SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning

modelscope/ms-swift 10 Aug 2024

With support for 300+ LLMs and 50+ MLLMs, SWIFT stands as the open-source framework that provides the most comprehensive support for fine-tuning large models.

Hallucination Optical Character Recognition +6

3,819
0.25 stars / hour