FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder

signofthefour/fregrad 18 Jan 2024

The goal of this paper is to generate realistic audio with a lightweight and fast diffusion-based vocoder, named FreGrad.

45
0.32 stars / hour

Taming Stable Diffusion for Text to 360° Panorama Image Generation

chengzhag/panfusion 11 Apr 2024

Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts.

Denoising, Image Generation

96
0.31 stars / hour

Prompts As Programs: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

microsoft/sammo 2 Apr 2024

We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs.

105
0.30 stars / hour

TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods

decisionintelligence/tfb 29 Mar 2024

Next, we employ TFB to perform a thorough evaluation of 21 Univariate Time Series Forecasting (UTSF) methods on 8,068 univariate time series and 14 Multivariate Time Series Forecasting (MTSF) methods on 25 datasets.

Benchmarking, Multivariate Time Series Forecasting, +2

124
0.29 stars / hour

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

scutzzj/aniportrait 26 Mar 2024

In this study, we propose AniPortrait, a novel framework for generating high-quality animation driven by audio and a reference portrait image.

Face Reenactment

3,579
0.29 stars / hour

DUSt3R: Geometric 3D Vision Made Easy

naver/dust3r 21 Dec 2023

Our formulation directly provides a 3D model of the scene as well as depth information, but interestingly, we can seamlessly recover from it, pixel matches, relative and absolute camera.

3D Reconstruction, Camera Calibration, +2

4,144
0.28 stars / hour

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

alibaba-damo-academy/FunASR 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

Self-Supervised Learning, Sentiment Analysis, +1

3,189
0.28 stars / hour

BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

flagopen/flagembedding 5 Feb 2024

It can simultaneously perform the three common retrieval functionalities of embedding model: dense retrieval, multi-vector retrieval, and sparse retrieval, which provides a unified model foundation for real-world IR applications.

Retrieval, Self-Knowledge Distillation

4,814
0.28 stars / hour

A Survey on Deep Learning for Theorem Proving

zhaoyu-li/dl4tp 15 Apr 2024

Theorem proving is a fundamental aspect of mathematics, spanning from informal reasoning in mathematical language to rigorous derivations in formal systems.

Automated Theorem Proving

52
0.28 stars / hour

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

bytedance/schurvins 4 Dec 2023

To this end, we propose a novel filter-based VINS framework named SchurVINS, which could guarantee both high accuracy by building a complete residual model and low computational complexity with Schur complement.

Computational Efficiency

244
0.28 stars / hour