Search Results for author: Jan Chorowski

Found 26 papers, 19 papers with code

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

1 code implementation • 29 Oct 2021 • Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer

We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC).

Frame, Self-Supervised Learning

Aligned Contrastive Predictive Coding

1 code implementation • 24 Apr 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations.

Representing Point Clouds with Generative Conditional Invertible Flow Networks

1 code implementation • 7 Oct 2020 • Michał Stypułkowski, Kacper Kania, Maciej Zamorski, Maciej Zięba, Tomasz Trzciński, Jan Chorowski

To exploit similarities between same-class objects and to improve model performance, we turn to weight sharing: networks that model densities of points belonging to objects in the same family share all parameters with the exception of a small, object-specific embedding vector.

Point Cloud Registration

A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning

no code implementations • 3 Jun 2020 • Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Lancucki, Ricard Marxer, James Glass

Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech.

Representation Learning, Self-Supervised Learning, +1

Conditional Invertible Flow for Point Cloud Generation

2 code implementations • 16 Oct 2019 • Michał Stypułkowski, Maciej Zamorski, Maciej Zięba, Jan Chorowski

This paper focuses on a novel generative approach for 3D point clouds that makes use of invertible flow-based models.

Point Cloud Generation

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

3 code implementations • 21 Feb 2019 • Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

Unsupervised speech representation learning using WaveNet autoencoders

6 code implementations • 25 Jan 2019 • Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord

We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms.

Acoustic Unit Discovery, Dimensionality Reduction, +1

Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees

no code implementations • 14 Jan 2019 • Jan Chorowski, Adrian Lancucki, Bartosz Kostka, Michal Zapotoczny

The embedding network is trained together with the rest of the acoustic model and removes one of the last cases in which neural systems have to be bootstrapped from GMM-HMM ones.

Frame, Language Modelling

Efficient Purely Convolutional Text Encoding

1 code implementation • 3 Aug 2018 • Szymon Malik, Adrian Lancucki, Jan Chorowski

In this work, we focus on a lightweight convolutional architecture that creates fixed-size vector embeddings of sentences.

A Talker Ensemble: the University of Wrocław's Entry to the NIPS 2017 Conversational Intelligence Challenge

no code implementations • 21 May 2018 • Jan Chorowski, Adrian Łańcucki, Szymon Malik, Maciej Pawlikowski, Paweł Rychlikowski, Paweł Zykowski

We present Poetwannabe, a chatbot submitted by the University of Wrocław to the NIPS 2017 Conversational Intelligence Challenge, in which it ranked first ex aequo.

Chatbot, Question Answering

On Using Backpropagation for Speech Texture Generation and Voice Conversion

no code implementations • 22 Dec 2017 • Jan Chorowski, Ron J. Weiss, Rif A. Saurous, Samy Bengio

Inspired by recent work on neural network image generation that relies on backpropagation towards the network inputs, we present a proof-of-concept system for speech texture synthesis and voice conversion based on two mechanisms: approximate inversion of the representation learned by a speech recognition neural network, and matching statistics of neuron activations between different source and target utterances.

Image Generation, Speech Recognition, +3

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

4 code implementations • 5 Dec 2017 • Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani

Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network.

Automatic Speech Recognition

On Multilingual Training of Neural Dependency Parsers

1 code implementation • 29 May 2017 • Michał Zapotoczny, Paweł Rychlikowski, Jan Chorowski

We analyze the representations of characters and words that are learned by the network to establish which properties of languages were accounted for.

Sequence-to-Sequence Models Can Directly Translate Foreign Speech

1 code implementation • 24 Mar 2017 • Ron J. Weiss, Jan Chorowski, Navdeep Jaitly, Yonghui Wu, Zhifeng Chen

We present a recurrent encoder-decoder deep neural network architecture that directly translates speech in one language into text in another.

Machine Translation, Sequence-To-Sequence Speech Recognition, +1

Towards better decoding and language model integration in sequence to sequence models

no code implementations • 8 Dec 2016 • Jan Chorowski, Navdeep Jaitly

The recently proposed Sequence-to-Sequence (seq2seq) framework advocates replacing complex data processing pipelines, such as an entire automatic speech recognition system, with a single neural network trained in an end-to-end fashion.

Automatic Speech Recognition

Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing

1 code implementation • 12 Sep 2016 • Jan Chorowski, Michał Zapotoczny, Paweł Rychlikowski

We present a dependency parser implemented as a single deep neural network that reads orthographic representations of words and directly generates dependencies and their labels.

Dependency Parsing, Part-Of-Speech Tagging, +2

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, Theano has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.

Dimensionality Reduction, General Classification

Task Loss Estimation for Sequence Prediction

1 code implementation • 19 Nov 2015 • Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio

Our idea is that this score can be interpreted as an estimate of the task loss, and that the estimation error may be used as a consistent surrogate loss.

Speech Recognition

End-to-End Attention-based Large Vocabulary Speech Recognition

1 code implementation • 18 Aug 2015 • Dzmitry Bahdanau, Jan Chorowski, Dmitriy Serdyuk, Philemon Brakel, Yoshua Bengio

Many of the current state-of-the-art Large Vocabulary Continuous Speech Recognition Systems (LVCSR) are hybrids of neural networks and Hidden Markov Models (HMMs).

Acoustic Modelling, Speech Recognition

Attention-Based Models for Speech Recognition

13 code implementations • NeurIPS 2015 • Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio

Recurrent sequence generators conditioned on input data through an attention mechanism have recently shown very good performance on a range of tasks including machine translation, handwriting synthesis and image caption generation.

Machine Translation, Speech Recognition, +1

Blocks and Fuel: Frameworks for deep learning

5 code implementations • 1 Jun 2015 • Bart van Merriënboer, Dzmitry Bahdanau, Vincent Dumoulin, Dmitriy Serdyuk, David Warde-Farley, Jan Chorowski, Yoshua Bengio

We introduce two Python frameworks to train neural networks on large datasets: Blocks and Fuel.

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results

no code implementations • 4 Dec 2014 • Jan Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio

We replace the Hidden Markov Model (HMM), which is traditionally used in continuous speech recognition, with a bi-directional recurrent neural network encoder coupled to a recurrent neural network decoder that directly emits a stream of phonemes.

Speech Recognition
