FAD

10 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

Adapting Frechet Audio Distance for Generative Music Evaluation

microsoft/fadtk • 2 Nov 2023

The growing popularity of generative music models underlines the need for perceptually relevant, objective music quality metrics.

Twitch Plays Pokemon, Machine Learns Twitch: Unsupervised Context-Aware Anomaly Detection for Identifying Trolls in Streaming Data

ahaque/twitch-troll-detection • 17 Feb 2019

With the increasing importance of online communities, discussion forums, and customer reviews, Internet "trolls" have proliferated, making it difficult for information seekers to find relevant and correct information.

Representation Sharing for Fast Object Detector Search and Beyond

MalongTech/research-fad • ECCV 2020

FAD consists of a designed search space and an efficient architecture search algorithm.

Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks

RussellSB/tt-vae-gan • 5 Sep 2021

This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio can be converted to the timbre of a target audio with minimal loss in quality.

Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms

marcojira/stylegan3-melspectrograms • 25 Jun 2022

We describe our approach for the generative emotional vocal burst task (ExVo Generate) of the ICML Expressive Vocalizations Competition.

Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning

lzp870/rsfd • 28 Nov 2022

In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens.

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

researchmm/mm-diffusion • CVPR 2023

To generate joint audio-video pairs, we propose a novel Multi-Modal Diffusion model (i.e., MM-Diffusion), with two-coupled denoising autoencoders.

Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference

biaofuxmu/fast • 14 Mar 2023

A popular approach to streaming speech translation is to employ a single offline model with a wait-k policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints.

AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection

zhoujingchun03/amsp-uod • 23 Aug 2023

In this paper, we present a novel Amplitude-Modulated Stochastic Perturbation and Vortex Convolutional Network, AMSP-UOD, designed for underwater object detection.

Latent CLAP Loss for Better Foley Sound Synthesis

karchkha/latent-clap-subjective-evaluation • 18 Mar 2024

We introduce a new loss term to enhance Foley sound generation in AudioLDM without post-filtering.
