TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK	USES EXTRA TRAINING DATA
Audio Super-Resolution	Piano	U-Net + TFiLM	Log-Spectral Distance	2	# 2
Audio Super-Resolution	Voice Bank corpus (VCTK)	U-Net + TFiLM	Log-Spectral Distance	2.5	# 2

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-film-capturing-long-range-sequence/audio-super-resolution-on-piano-1)](https://paperswithcode.com/sota/audio-super-resolution-on-piano-1?p=temporal-film-capturing-long-range-sequence)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-film-capturing-long-range-sequence/audio-super-resolution-on-voice-bank-corpus-1)](https://paperswithcode.com/sota/audio-super-resolution-on-voice-bank-corpus-1?p=temporal-film-capturing-long-range-sequence)`

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

14 Sep 2019 · Sawyer Birnbaum, Volodymyr Kuleshov, Zayd Enam, Pang Wei Koh, Stefano Ermon

Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields, while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) -- a novel architectural component inspired by adaptive batch normalization and its extensions -- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.
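The abstract's core idea, a recurrent network that rescales convolutional activations, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from this page: the block-wise max pooling, the plain tanh recurrence (standing in for whatever RNN the authors use), the affine form `gamma * x + beta`, and all names (`tfilm_sketch`, `w_in`, `w_rec`, `w_gamma`, `w_beta`).

```python
import numpy as np

def tfilm_sketch(features, block_size, w_in, w_rec, w_gamma, w_beta):
    """Modulate conv activations of shape (T, C) block by block with an RNN.

    Illustrative only: a simple tanh RNN is an assumption, not the
    authors' implementation.
    """
    T, C = features.shape
    assert T % block_size == 0, "T must be a multiple of block_size"
    n_blocks = T // block_size

    # Split the time axis into blocks and summarize each block per channel.
    blocks = features.reshape(n_blocks, block_size, C)
    pooled = blocks.max(axis=1)                      # (n_blocks, C)

    h = np.zeros(w_rec.shape[0])                     # recurrent state
    out = np.empty_like(blocks)
    for b in range(n_blocks):
        # The state carries information from all earlier blocks forward.
        h = np.tanh(pooled[b] @ w_in + h @ w_rec)
        gamma = h @ w_gamma                          # per-channel scale, (C,)
        beta = h @ w_beta                            # per-channel shift, (C,)
        out[b] = gamma * blocks[b] + beta            # feature-wise modulation
    return out.reshape(T, C)

# Hypothetical sizes and random weights, chosen only for the demo.
rng = np.random.default_rng(0)
T, C, H = 16, 4, 8
x = rng.standard_normal((T, C))
w_in = 0.3 * rng.standard_normal((C, H))
w_rec = 0.3 * rng.standard_normal((H, H))
w_gamma = 0.3 * rng.standard_normal((H, C))
w_beta = 0.3 * rng.standard_normal((H, C))
y = tfilm_sketch(x, block_size=4, w_in=w_in, w_rec=w_rec,
                 w_gamma=w_gamma, w_beta=w_beta)
```

In this sketch the scale and shift applied to a late block depend on every earlier block through the recurrent state, which is one way a fixed-receptive-field convolution could pick up longer-range context. The recurrence runs once per block rather than once per time step, so its cost stays small relative to the convolutions.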

Code

leolya/Audio-Super-Resolution-Tenso…

Tasks

Audio Super-Resolution

Super-Resolution

Text Classification

Datasets

VCTK

Results from the Paper

Ranked #2 on Audio Super-Resolution on Voice Bank corpus (VCTK) (using extra training data)
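The page reports Log-Spectral Distance (LSD) without defining it. The sketch below uses a commonly cited form, the per-frame root-mean-square difference of log power spectra averaged over frames; whether the leaderboard uses exactly this form is an assumption. The function name, frame length, hop size, Hann window and `eps` are likewise illustrative choices, not values from this page.

```python
import numpy as np

def log_spectral_distance(reference, estimate,
                          frame_len=2048, hop=512, eps=1e-10):
    """LSD between two equal-length 1-D signals (lower is better).

    Assumed definition: mean over frames of the RMS, over frequency
    bins, of the difference in log10 power spectra.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    assert reference.shape == estimate.shape
    assert reference.ndim == 1 and len(reference) >= frame_len

    window = np.hanning(frame_len)
    n_frames = 1 + (len(reference) - frame_len) // hop

    def log_power_spectrogram(signal):
        # Slice overlapping frames, window them, take the one-sided FFT.
        frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                           for i in range(n_frames)])
        spectrum = np.fft.rfft(frames, axis=1)
        return np.log10(np.abs(spectrum) ** 2 + eps)

    diff = log_power_spectrogram(reference) - log_power_spectrogram(estimate)
    per_frame = np.sqrt(np.mean(diff ** 2, axis=1))   # RMS over frequency
    return float(per_frame.mean())                    # average over frames

# Synthetic check signal; real use would compare a reference recording
# with a super-resolved reconstruction of the same length.
rng = np.random.default_rng(0)
clean = rng.standard_normal(8192)
lsd_same = log_spectral_distance(clean, clean)
lsd_scaled = log_spectral_distance(clean, 10.0 * clean)
```

Under this definition, identical signals give a distance of zero, and scaling a signal by 10 shifts every log power bin by 2, which makes a convenient sanity check.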


Methods

Batch Normalization • SPEED
