Trending Research

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Leeroo-AI/mergoo • • 12 Mar 2024

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.

Ranked #30 on Question Answering on TriviaQA

Arithmetic Reasoning Code Generation +6

179

0.87 stars / hour

Paper
Code

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

picsart-ai-research/streamingt2v • • 21 Mar 2024

To overcome these limitations, we introduce StreamingT2V, an autoregressive approach for long video generation of 80, 240, 600, 1200 or more frames with smooth transitions.

Text-to-Video Generation Video Generation

807

0.84 stars / hour

Paper
Code

AutoCodeRover: Autonomous Program Improvement

nus-apr/auto-code-rover • 8 Apr 2024

Recent progress in Large Language Models (LLMs) has significantly impacted the development process, where developers can use LLM-based programming assistants to achieve automated coding.

Bug fixing Code Search +1

1,950

0.76 stars / hour

Paper
Code

SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap

soccernet/sn-gamestate • • 17 Apr 2024

This tracking and identification process is crucial for reconstructing the game state, defined by the athletes' positions and identities on a 2D top-view of the pitch, (i. e. a minimap).

Camera Calibration

0.75 stars / hour

Paper
Code

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

instantstyle/instantstyle • • 3 Apr 2024

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.

Text-to-Image Generation

1,053

0.71 stars / hour

Paper
Code

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

microsoft/mechanistic-error-probe • • 26 Sep 2023

We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text.

0.69 stars / hour

Paper
Code

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

lingyvkong/onechart • • 15 Apr 2024

To address this, we propose OneChart: a reliable agent specifically devised for the structural extraction of chart information.

0.68 stars / hour

Paper
Code

Robust Speech Recognition via Large-Scale Weak Supervision

ggerganov/whisper.cpp • Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Ranked #1 on Speech Recognition on Common Voice Italian (using extra training data)

Robust Speech Recognition speech-recognition

30,903

0.68 stars / hour

Paper
Code

ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model

chenhongruixuan/mambacd • 4 Apr 2024

For the change decoder, which is available in all three architectures, we propose three spatio-temporal relationship modeling mechanisms, which can be naturally combined with the Mamba architecture and fully utilize its attribute to achieve spatio-temporal interaction of multi-temporal features, thereby obtaining accurate change information.

Ranked #1 on 2D Semantic Segmentation on xBD

2D Semantic Segmentation Attribute +1

155

0.61 stars / hour

Paper
Code

TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting

ibm/tsfm • 14 Jun 2023

TSMixer outperforms state-of-the-art MLP and Transformer models in forecasting by a considerable margin of 8-60%.

Multivariate Time Series Forecasting Representation Learning +2

152

0.57 stars / hour

Paper
Code