Search Results for author: Lars Maaløe

Found 20 papers, 8 papers with code

Recurrent Spatial Transformer Networks

2 code implementations • 17 Sep 2015 • Søren Kaae Sønderby, Casper Kaae Sønderby, Lars Maaløe, Ole Winther

We investigate different down-sampling factors (ratio of pixel in input and output) for the SPN and show that the RNN-SPN model is able to down-sample the input images without deteriorating performance.

Attribute

912

Paper
Code

Auxiliary Deep Generative Models

1 code implementation • 17 Feb 2016 • Lars Maaløe, Casper Kaae Sønderby, Søren Kaae Sønderby, Ole Winther

The auxiliary variables leave the generative model unchanged but make the variational distribution more expressive.

Ranked #49 on Image Classification on SVHN

108

Paper
Code

Ladder Variational Autoencoders

5 code implementations • NeurIPS 2016 • Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, Ole Winther

Variational Autoencoders are powerful models for unsupervised learning.

Unsupervised MNIST

Paper
Code

BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling

2 code implementations • NeurIPS 2019 • Lars Maaløe, Marco Fraccaro, Valentin Liévin, Ole Winther

In this paper we close the performance gap by constructing VAE models that can effectively utilize a deep hierarchy of stochastic variables and model complex covariance structures.

Ranked #18 on Image Generation on ImageNet 32x32 (bpd metric)

Anomaly Detection Attribute +1

Paper
Code

Hierarchical VAEs Know What They Don't Know

4 code implementations • 16 Feb 2021 • Jakob D. Havtorn, Jes Frellsen, Søren Hauberg, Lars Maaløe

Deep generative models have been demonstrated as state-of-the-art density estimators.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Paper
Code

Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study

1 code implementation • 21 Apr 2023 • Joakim Edin, Alexander Junge, Jakob D. Havtorn, Lasse Borgholt, Maria Maistro, Tuukka Ruotsalo, Lars Maaløe

Medical coding is the task of assigning medical codes to clinical free-text documentation.

Ranked #1 on Medical Code Prediction on MIMIC-IV ICD-10

Medical Code Prediction

Paper
Code

Utilizing Domain Knowledge in End-to-End Audio Processing

1 code implementation • 1 Dec 2017 • Tycho Max Sylvester Tax, Jose Luis Diez Antich, Hendrik Purwins, Lars Maaløe

End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations.

Environmental Sound Classification General Classification +1

Paper
Code

Exploiting Nontrivial Connectivity for Automatic Speech Recognition

no code implementations • 28 Nov 2017 • Marius Paraschiv, Lasse Borgholt, Tycho Max Sylvester Tax, Marco Singh, Lars Maaløe

Nontrivial connectivity has allowed the training of very deep networks by addressing the problem of vanishing gradients and offering a more efficient method of reusing parameters.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Semi-Supervised Generation with Cluster-aware Generative Models

no code implementations • 3 Apr 2017 • Lars Maaløe, Marco Fraccaro, Ole Winther

Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning.

Clustering General Classification

Paper
Add Code

Feature Map Variational Auto-Encoders

no code implementations • ICLR 2018 • Lars Maaløe, Ole Winther

There have been multiple attempts with variational auto-encoders (VAE) to learn powerful global representations of complex data using a combination of latent stochastic variables and an autoregressive model over the dimensions of the data.

Image Generation

Paper
Add Code

MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

no code implementations • ACL 2020 • Jakob D. Havtorn, Jan Latko, Joakim Edin, Lasse Borgholt, Lars Maaløe, Lorenzo Belgrano, Nicolai F. Jacobsen, Regitze Sdun, Željko Agić

We address a challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, which embeds within a broader decision support system for emergency call-takers.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

On Scaling Contrastive Representations for Low-Resource Speech Recognition

no code implementations • 1 Feb 2021 • Lasse Borgholt, Tycho Max Sylvester Tax, Jakob Drachmann Havtorn, Lars Maaløe, Christian Igel

We explore the performance of such systems without fine-tuning by training a state-of-the-art speech recognizer on the fixed representations from the computationally demanding wav2vec 2. 0 framework.

Self-Supervised Learning speech-recognition +1

Paper
Add Code

Do End-to-End Speech Recognition Models Care About Context?

no code implementations • 17 Feb 2021 • Lasse Borgholt, Jakob Drachmann Havtorn, Željko Agić, Anders Søgaard, Lars Maaløe, Christian Igel

We test this hypothesis by measuring temporal context sensitivity and evaluate how the models perform when we constrain the amount of contextual information in the audio input.

Language Modelling speech-recognition +1

Paper
Add Code

Towards Generative Latent Variable Models for Speech

no code implementations • 29 Sep 2021 • Jakob Drachmann Havtorn, Lasse Borgholt, Jes Frellsen, Søren Hauberg, Lars Maaløe

While stochastic latent variable models (LVMs) now achieve state-of-the-art performance on natural image generation, they are still inferior to deterministic models on speech.

Image Generation Video Generation

Paper
Add Code

Towards Hierarchical Discrete Variational Autoencoders

no code implementations • pproximateinference AABI Symposium 2019 • Valentin Liévin, Andrea Dittadi, Lars Maaløe, Ole Winther

We introduce the Hierarchical Discrete Variational Autoencoder (HD-VAE): a hi- erarchy of variational memory layers.

Paper
Add Code

Do We Still Need Automatic Speech Recognition for Spoken Language Understanding?

no code implementations • 29 Nov 2021 • Lasse Borgholt, Jakob Drachmann Havtorn, Mostafa Abdou, Joakim Edin, Lars Maaløe, Anders Søgaard, Christian Igel

We compare learned speech features from wav2vec 2. 0, state-of-the-art ASR transcripts, and the ground truth text as input for a novel speech-based named entity recognition task, a cardiac arrest detection task on real-world emergency calls and two existing SLU benchmarks.

Ranked #7 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Automatic Speech Recognition Automatic Speech Recognition (ASR) +8

Paper
Add Code

Benchmarking Generative Latent Variable Models for Speech

1 code implementation • 22 Feb 2022 • Jakob D. Havtorn, Lasse Borgholt, Søren Hauberg, Jes Frellsen, Lars Maaløe

Stochastic latent variable models (LVMs) achieve state-of-the-art performance on natural image generation but are still inferior to deterministic models on speech.

Benchmarking Image Generation +1

Paper
Code

Model-agnostic out-of-distribution detection using combined statistical tests

no code implementations • 2 Mar 2022 • Federico Bergamin, Pierre-Alexandre Mattei, Jakob D. Havtorn, Hugo Senetaire, Hugo Schmutz, Lars Maaløe, Søren Hauberg, Jes Frellsen

These techniques, based on classical statistical tests, are model-agnostic in the sense that they can be applied to any differentiable generative model.

Out-of-Distribution Detection

Paper
Add Code

A Brief Overview of Unsupervised Neural Speech Representation Learning

no code implementations • 1 Mar 2022 • Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, Lars Maaløe, Christian Igel

Unsupervised representation learning for speech processing has matured greatly in the last few years.

Representation Learning

Paper
Add Code

Self-Supervised Speech Representation Learning: A Review

no code implementations • 21 May 2022 • Abdelrahman Mohamed, Hung-Yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe

Although self-supervised speech representation is still a nascent research area, it is closely related to acoustic word embedding and learning with zero lexical resources, both of which have seen active research for many years.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.