Trending Research

HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models

osu-nlp-group/hipporag • 23 May 2024

In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting.

Hippocampus, Knowledge Graphs, +3

552

0.54 stars / hour

MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion

Francis-Rings/MotionFollower • 30 May 2024

In this paper, we propose MotionFollower, a lightweight score-guided diffusion model for video motion editing.

Denoising, Image Animation, +2

106

0.53 stars / hour

VideoTetris: Towards Compositional Text-to-Video Generation

yangling0818/videotetris • 6 Jun 2024

Diffusion models have demonstrated great success in text-to-video (T2V) generation.

Denoising, Text-to-Video Generation, +1

0.53 stars / hour

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

ictnlp/streamspeech • 5 Jun 2024

Simultaneous speech-to-speech translation (Simul-S2ST, a.k.a. streaming speech translation) outputs target speech while receiving streaming speech inputs, which is critical for real-time communication.

Ranked #1 on de-en on CVSS

Automatic Speech Recognition (ASR), de-en, +11

269

0.52 stars / hour

On the Measure of Intelligence

fchollet/ARC • 5 Nov 2019

To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans.

Benchmarking

2,452

0.52 stars / hour

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

xrli-U/MuSc • 30 Jan 2024

We reveal that the abundant normal and abnormal cues implicit in unlabeled test images can be exploited for anomaly determination, which is ignored by prior methods.

Anomaly Classification

136

0.48 stars / hour

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

openbmb/minicpm-v • 18 Mar 2024

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

7,114

0.47 stars / hour

Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head

om-ai-lab/OmDet • 11 Mar 2024

End-to-end transformer-based detectors (DETRs) have shown exceptional performance in both closed-set and open-vocabulary object detection (OVD) tasks through the integration of language modalities.

Open Vocabulary Object Detection, Real-Time Object Detection

606

0.46 stars / hour

One-Step Effective Diffusion Network for Real-World Image Super-Resolution

cswry/osediff • 12 Jun 2024

Our experiments demonstrate that OSEDiff achieves comparable or even better Real-ISR results, in terms of both objective metrics and subjective evaluations, than previous diffusion model based Real-ISR methods that require dozens or hundreds of steps.

Image Restoration, Image Super-Resolution

0.46 stars / hour

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

vikparuchuri/marker • 11 Jan 2021

We design models based off T5-Base and T5-Large to obtain up to 7x increases in pre-training speed with the same computational resources.

Language Modelling, Question Answering

12,422

0.45 stars / hour
