Trending Research

FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder

signofthefour/fregrad • 18 Jan 2024

The goal of this paper is to generate realistic audio with a lightweight and fast diffusion-based vocoder, named FreGrad.

0.32 stars / hour

Taming Stable Diffusion for Text to 360° Panorama Image Generation

chengzhag/panfusion • 11 Apr 2024

Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts.

Denoising, Image Generation

0.31 stars / hour

Prompts As Programs: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

microsoft/sammo • 2 Apr 2024

We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs.

105 stars

0.30 stars / hour

TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods

decisionintelligence/tfb • 29 Mar 2024

Next, we employ TFB to perform a thorough evaluation of 21 Univariate Time Series Forecasting (UTSF) methods on 8,068 univariate time series and 14 Multivariate Time Series Forecasting (MTSF) methods on 25 datasets.

Benchmarking, Multivariate Time Series Forecasting, +2

124 stars

0.29 stars / hour

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

scutzzj/aniportrait • 26 Mar 2024

In this study, we propose AniPortrait, a novel framework for generating high-quality animation driven by audio and a reference portrait image.

Face Reenactment

3,579 stars

0.29 stars / hour

DUSt3R: Geometric 3D Vision Made Easy

naver/dust3r • 21 Dec 2023

Our formulation directly provides a 3D model of the scene as well as depth information, but, interestingly, we can also seamlessly recover pixel matches and relative and absolute cameras from it.

3D Reconstruction, Camera Calibration, +2

4,144 stars

0.28 stars / hour

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

alibaba-damo-academy/FunASR • 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

Self-Supervised Learning, Sentiment Analysis, +1

3,189 stars

0.28 stars / hour

BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

flagopen/flagembedding • 5 Feb 2024

It can simultaneously perform the three common retrieval functionalities of embedding models: dense retrieval, multi-vector retrieval, and sparse retrieval, which provides a unified model foundation for real-world IR applications.

Retrieval, Self-Knowledge Distillation

4,814 stars

0.28 stars / hour

A Survey on Deep Learning for Theorem Proving

zhaoyu-li/dl4tp • 15 Apr 2024

Theorem proving is a fundamental aspect of mathematics, spanning from informal reasoning in mathematical language to rigorous derivations in formal systems.

Automated Theorem Proving

0.28 stars / hour

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

bytedance/schurvins • 4 Dec 2023

To this end, we propose a novel filter-based VINS framework named SchurVINS, which could guarantee both high accuracy by building a complete residual model and low computational complexity with Schur complement.

Computational Efficiency

244 stars

0.28 stars / hour
