Search Results for author: Alexandre de Brébisson

Found 7 papers, 5 papers with code

A Cheap Linear Attention Mechanism with Fast Lookups and Fixed-Size Representations

no code implementations · 19 Sep 2016 · Alexandre de Brébisson, Pascal Vincent

These two limitations restrict the use of the softmax attention mechanism to relatively small-scale applications with short sequences and few lookups per sequence.

Question Answering

Exact gradient updates in time independent of output size for the spherical loss family

no code implementations · 26 Jun 2016 · Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier

An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in e.g. neural language models or the learning of word-embeddings, often posed as predicting the probability of next words among a vocabulary of size D (e.g. 200,000).

Word Embeddings

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation · 9 May 2016 · The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.

BIG-bench Machine Learning, Clustering +2

The Z-loss: a shift and scale invariant classification loss belonging to the Spherical Family

1 code implementation · 29 Apr 2016 · Alexandre de Brébisson, Pascal Vincent

In this paper, we introduce an alternative classification loss function, the Z-loss, which is designed to address these two issues.

General Classification, Language Modelling

An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family

1 code implementation · 16 Nov 2015 · Alexandre de Brébisson, Pascal Vincent

In particular, we focus our investigation on spherical bounds of the log-softmax loss and on two spherical log-likelihood losses, namely the log-Spherical Softmax suggested by Vincent et al. (2015) and the log-Taylor Softmax that we introduce.

Language Modelling, Multi-class Classification

Artificial Neural Networks Applied to Taxi Destination Prediction

1 code implementation · 31 Jul 2015 · Alexandre de Brébisson, Étienne Simon, Alex Auvolat, Pascal Vincent, Yoshua Bengio

We describe our first-place solution to the ECML/PKDD discovery challenge on taxi destination prediction.

Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets

1 code implementation · NeurIPS 2015 · Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier

An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in e.g. neural language models or the learning of word-embeddings, often posed as predicting the probability of next words among a vocabulary of size D (e.g. 200,000).

Word Embeddings
