In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.
By exploring the similarities and disparities between the effective Mamba and the subpar linear attention Transformer, we provide a comprehensive analysis to demystify the key factors behind Mamba's success.
In this article we show how the problem of neural text generation can be constructively reformulated in terms of transitions between the states of a finite-state machine.
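The FSM view of generation can be made concrete with a toy sketch: at each step, the machine's current state determines which tokens are admissible, and emitting a token is a state transition. The vocabulary, transition table, and function names below are illustrative assumptions, not the paper's implementation.

```python
# Toy FSM that accepts exactly the token sequences "yes" or "no".
# state -> {token: next_state}; state -1 is the accept state.
VOCAB = ["y", "e", "s", "n", "o", "<eos>"]

FSM = {
    0: {"y": 1, "n": 4},
    1: {"e": 2},
    2: {"s": 3},
    3: {"<eos>": -1},
    4: {"o": 5},
    5: {"<eos>": -1},
}

def allowed_tokens(state):
    """Tokens the FSM permits from the current state."""
    return set(FSM.get(state, {}))

def generate(choose):
    """Generate one sequence constrained by the FSM; `choose` picks a
    token from the allowed set (standing in for a language model)."""
    state, out = 0, []
    while state != -1:
        token = choose(allowed_tokens(state))
        out.append(token)
        state = FSM[state][token]
    return "".join(t for t in out if t != "<eos>")
```

In a real decoder, `choose` would sample from the model's next-token distribution after masking out every token not in `allowed_tokens(state)`, which guarantees the output stays in the language the FSM defines.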
In this paper, we reformulate this task as a single-label prediction problem by encoding the multi-speaker labels with a power set.
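The power-set reformulation can be sketched as follows: every subset of simultaneously active speakers (up to a size cap) becomes one class, so a multi-label problem collapses into ordinary single-label classification. The function names and the cap parameter are illustrative assumptions.

```python
from itertools import combinations

def build_powerset(num_speakers, max_simultaneous):
    """Enumerate all speaker subsets up to a size cap as class labels."""
    classes = []
    for k in range(max_simultaneous + 1):
        classes.extend(combinations(range(num_speakers), k))
    return {subset: idx for idx, subset in enumerate(classes)}

def encode(active_speakers, mapping):
    """Map the set of currently active speakers to a single class index."""
    return mapping[tuple(sorted(active_speakers))]

# With 3 speakers and at most 2 overlapping, the classes are:
# (), (0,), (1,), (2,), (0,1), (0,2), (1,2)  -> 7 classes total
mapping = build_powerset(num_speakers=3, max_simultaneous=2)
```

The cap on simultaneous speakers keeps the class count small: without it, the number of classes would grow as 2^N in the number of speakers.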
To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.
MLA guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.
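The KV-compression idea can be sketched with low-rank projections: instead of caching full per-head keys and values for every token, cache one small latent vector per token and up-project it at attention time. The dimensions, weight names, and structure below are illustrative assumptions in the spirit of MLA, not the actual DeepSeek implementation.

```python
import numpy as np

d_model, d_latent, d_head = 64, 8, 16
rng = np.random.default_rng(0)

W_down = rng.standard_normal((d_model, d_latent))   # compress to latent
W_up_k = rng.standard_normal((d_latent, d_head))    # recover keys
W_up_v = rng.standard_normal((d_latent, d_head))    # recover values

def cache_token(h):
    """Store only the latent vector for this token's hidden state h."""
    return h @ W_down                                # shape (d_latent,)

def expand(latent_cache):
    """Reconstruct keys and values from the cached latents."""
    Z = np.stack(latent_cache)                       # (seq, d_latent)
    return Z @ W_up_k, Z @ W_up_v                    # (seq, d_head) each

cache = [cache_token(rng.standard_normal(d_model)) for _ in range(5)]
K, V = expand(cache)
# Per token the cache holds d_latent floats instead of 2 * d_head,
# which is the source of the memory saving.
```

Here the cache stores 8 floats per token rather than 32 (16 for the key plus 16 for the value), at the cost of the two up-projections during attention.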
Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation.
Recently, linear complexity sequence modeling networks have achieved modeling capabilities similar to Vision Transformers on a variety of computer vision tasks, while using fewer FLOPs and less memory.
Scale has become a main ingredient in obtaining strong machine learning models.
Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
Our method ranks first on the virtual try-on task on the VITON-HD benchmark.