no code implementations • 13 Nov 2024 • Sanjay Haresh, Daniel Dijkman, Apratim Bhattacharyya, Roland Memisevic
Robotics tasks are highly compositional by nature.
no code implementations • 23 Oct 2024 • Ashish Khisti, M. Reza Ebrahimi, Hassan Dbouk, Arash Behboodi, Roland Memisevic, Christos Louizos
In this work we show that the optimal scheme can be decomposed into a two-step solution: in the first step an importance sampling (IS) type scheme is used to select one intermediate token; in the second step (single-draft) speculative sampling is applied to generate the output token.
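The second step mentioned here is standard single-draft speculative sampling, which accepts the draft token with probability min(1, p/q) and otherwise resamples from the normalized residual. A minimal NumPy sketch of that accept/resample rule (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def speculative_step(p, q, draft_token, rng):
    """One single-draft speculative sampling step.

    p: target-model distribution over the vocabulary (1-D array, sums to 1)
    q: draft-model distribution that `draft_token` was sampled from
    Returns a token distributed exactly according to p.
    """
    # Accept the draft token with probability min(1, p/q).
    if rng.random() < min(1.0, p[draft_token] / q[draft_token]):
        return draft_token
    # Otherwise resample from the normalized residual max(p - q, 0).
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p), p=residual)
```

The accept/resample construction is what makes the output token exactly distributed according to the target model despite being proposed by the cheaper draft model.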
no code implementations • 3 Oct 2024 • Rishit Dagli, Guillaume Berger, Joanna Materzynska, Ingo Bax, Roland Memisevic
We introduce AirLetters, a new video dataset consisting of real-world videos of human-generated, articulated motions.
no code implementations • 10 Aug 2024 • MohammadReza Ebrahimi, Sunny Panchal, Roland Memisevic
Despite their recent successes, Transformer-based large language models show surprising failure modes.
1 code implementation • 11 Jul 2024 • Sunny Panchal, Apratim Bhattacharyya, Guillaume Berger, Antoine Mercier, Cornelius Bohm, Florian Dietrichkeit, Reza Pourreza, Xuanlin Li, Pulkit Madan, Mingu Lee, Mark Todorovich, Ingo Bax, Roland Memisevic
The benchmark requires vision-language models to recognize complex human actions, identify possible mistakes, and provide appropriate feedback in real-time.
1 code implementation • 1 Nov 2023 • Zhan Ling, Yunhao Fang, Xuanlin Li, Tongzhou Mu, Mingu Lee, Reza Pourreza, Roland Memisevic, Hao Su
Large Language Models (LLMs) have achieved tremendous progress, yet they still often struggle with challenging reasoning problems.
no code implementations • 16 Aug 2023 • Reza Pourreza, Apratim Bhattacharyya, Sunny Panchal, Mingu Lee, Pulkit Madan, Roland Memisevic
In this work, we apply LLMs to image generation tasks by directly generating the virtual brush strokes to paint an image.
no code implementations • 30 Jun 2023 • Apratim Bhattacharyya, Sunny Panchal, Mingu Lee, Reza Pourreza, Pulkit Madan, Roland Memisevic
Multi-modal language models (LM) have recently shown promising performance in high-level reasoning tasks on videos.
1 code implementation • NeurIPS 2023 • Zhan Ling, Yunhao Fang, Xuanlin Li, Zhiao Huang, Mingu Lee, Roland Memisevic, Hao Su
In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each only receiving their necessary context and premises.
no code implementations • 14 May 2023 • Antoine Mercier, Guillaume Berger, Sunny Panchal, Florian Letsch, Cornelius Boehm, Nahua Kang, Ingo Bax, Roland Memisevic
End-to-end learning has taken hold of many computer vision tasks, in particular those related to still images, with task-specific optimization yielding very strong performance.
no code implementations • 11 Nov 2022 • Roland Memisevic
In this essay we relate parameter sharing ("weight sharing") to analogy making and the school of thought of cognitive metaphor.
1 code implementation • 24 Apr 2018 • Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, David Fleet, Roland Memisevic
We describe a DNN for video classification and captioning, trained end-to-end, with shared features, to solve tasks at different levels of granularity, exploring the link between granularity in a source task and the quality of learned features for transfer learning.
5 code implementations • ICCV 2017 • Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzyńska, Susanne Westphal, Heuna Kim, Valentin Haenel, Ingo Fruend, Peter Yianilos, Moritz Mueller-Freitag, Florian Hoppe, Christian Thurau, Ingo Bax, Roland Memisevic
Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification.
Ranked #115 on Action Recognition on Something-Something V2
1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang
Since its introduction, Theano has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.
no code implementations • NeurIPS 2016 • Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio
In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs).
Ranked #23 on Language Modelling on Text8
1 code implementation • 16 Feb 2016 • Daniel Jiwoong Im, Chris Dongjoo Kim, Hui Jiang, Roland Memisevic
Gatys et al. (2015) showed that optimizing pixels to match the features a convolutional network extracts from a reference image is a way to render images of high visual quality.
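The pixel-optimization idea can be sketched in miniature with a fixed random linear map standing in for the convolutional feature extractor; gradient descent on the pixels drives their features toward those of the reference image. All names here are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((32, 64))      # fixed "feature extractor" (stand-in for a convnet layer)
reference = rng.standard_normal(64)    # reference image, flattened
target_features = F @ reference

x = np.zeros(64)                       # pixels being optimized
lr = 1.0 / np.linalg.norm(F, 2) ** 2   # safe step size for this quadratic objective
for _ in range(2000):
    # Gradient of 0.5 * ||F x - target_features||^2 with respect to the pixels.
    x -= lr * F.T @ (F @ x - target_features)
```

In the actual method the linear map is replaced by a deep network's feature responses, so the optimization is nonconvex, but the loop has the same shape.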
1 code implementation • 26 Nov 2015 • David Krueger, Roland Memisevic
We stabilize the activations of Recurrent Neural Networks (RNNs) by penalizing the squared distance between successive hidden states' norms.
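The penalty described here is simple to state directly. A NumPy sketch of the norm-stabilizer term (the weighting `beta` and array layout are assumptions for illustration):

```python
import numpy as np

def norm_stabilizer(hidden_states, beta=1.0):
    """Norm-stabilizer penalty for an RNN.

    hidden_states: array of shape (T, hidden_dim), one row per time step.
    Returns beta times the mean squared difference between the norms
    of successive hidden states.
    """
    norms = np.linalg.norm(hidden_states, axis=1)
    return beta * np.mean((norms[1:] - norms[:-1]) ** 2)
```

Added to the task loss, the term discourages hidden-state norms from growing or shrinking between steps while leaving the direction of the hidden state unconstrained.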
no code implementations • 19 Nov 2015 • Daniel Jiwoong Im, Sungjin Ahn, Roland Memisevic, Yoshua Bengio
Denoising autoencoders (DAE) are trained to reconstruct their clean inputs with noise injected at the input level, while variational autoencoders (VAE) are trained with noise injected in their stochastic hidden layer, with a regularizer that encourages this noise injection.
5 code implementations • 9 Nov 2015 • Zhouhan Lin, Roland Memisevic, Kishore Konda
We propose ways to improve the performance of fully connected networks.
no code implementations • 29 Oct 2015 • Samira Ebrahimi Kahou, Vincent Michalski, Roland Memisevic
The proposed Recurrent Attentive Tracking Model performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging tracking data set.
2 code implementations • 11 Oct 2015 • Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio
For most deep learning algorithms, training is notoriously time-consuming.
no code implementations • 29 Jun 2015 • Xavier Bouthillier, Kishore Konda, Pascal Vincent, Roland Memisevic
Dropout is typically interpreted as bagging a large number of models sharing parameters.
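For a linear model the bagging interpretation can be checked directly: averaging predictions over sampled dropout masks agrees with a single deterministic pass whose inputs are scaled by the keep probability. A small NumPy sketch (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(10)            # weights of a linear model
x = rng.standard_normal(10)            # one input
p_keep = 0.5

# Ensemble view: each Bernoulli mask selects a "sub-model"; average their outputs.
masks = rng.random((100000, 10)) < p_keep
ensemble_avg = np.mean((masks * x) @ w)

# Weight-scaling view: one deterministic pass with the input scaled by p_keep.
scaled = (p_keep * x) @ w
```

For nonlinear networks the two views no longer coincide exactly, which is where interpretations of dropout start to diverge.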
no code implementations • 25 Jun 2015 • Daniel Jiwoong Im, Mohamed Ishmael Diwan Belghazi, Roland Memisevic
We discuss necessary and sufficient conditions for an auto-encoder to define a conservative vector field, in which case it is associated with an energy function akin to the unnormalized log-probability of the data.
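A vector field is conservative exactly when its Jacobian is symmetric, and a tied-weight autoencoder r(x) = Wᵀσ(Wx + b) + c has Jacobian Wᵀ diag(σ′) W, which is symmetric by construction. A minimal numerical check of that symmetry (shapes and names are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 8))        # tied encoder/decoder weights
b = rng.standard_normal(5)
c = rng.standard_normal(8)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def reconstruct(x):
    # Tied-weight autoencoder: decoder is the transpose of the encoder.
    return W.T @ sigmoid(W @ x + b) + c

# Central-difference Jacobian of the reconstruction at a random point.
x0 = rng.standard_normal(8)
eps = 1e-6
J = np.column_stack([
    (reconstruct(x0 + eps * e) - reconstruct(x0 - eps * e)) / (2 * eps)
    for e in np.eye(8)
])
```

Since J is symmetric everywhere, the reconstruction field integrates to a scalar potential, which is the energy function the abstract refers to.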
no code implementations • 5 Mar 2015 • Samira Ebrahimi Kahou, Xavier Bouthillier, Pascal Lamblin, Caglar Gulcehre, Vincent Michalski, Kishore Konda, Sébastien Jean, Pierre Froumenty, Yann Dauphin, Nicolas Boulanger-Lewandowski, Raul Chandias Ferrari, Mehdi Mirza, David Warde-Farley, Aaron Courville, Pascal Vincent, Roland Memisevic, Christopher Pal, Yoshua Bengio
The task of the Emotion Recognition in the Wild (EmotiW) Challenge is to assign one of seven emotions to short video clips extracted from Hollywood style movies.
1 code implementation • IJCNLP 2015 • Sébastien Jean, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio
The models trained by the proposed approach are empirically found to outperform the baseline models with a small vocabulary as well as the LSTM-based neural machine translation models.
no code implementations • NeurIPS 2014 • Vincent Michalski, Roland Memisevic, Kishore Konda
We propose modeling time series by representing the transformations that take a frame at time t to a frame at time t+1.
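The core idea of representing the frame-to-frame transformation, rather than the frames themselves, can be sketched with a single linear map in place of the paper's gated/bilinear units: fit a matrix taking each frame to the next, then reapply it to predict. A toy NumPy example on a rotating point (a simplified stand-in, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sequence: a point rotating in the plane; each "frame" is its coordinates.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
frames = [rng.standard_normal(2)]
for _ in range(20):
    frames.append(R @ frames[-1])
X = np.stack(frames)

# Fit a single matrix T with x_{t+1} ~= x_t T (least squares over all pairs).
T, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)

# Predict the next frame by reapplying the learned transformation.
prediction = X[-1] @ T
```

Because the toy sequence is generated by one fixed rotation, least squares recovers the transformation exactly; the paper's contribution is learning such transformations as latent representations rather than fixed matrices.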
no code implementations • 13 Feb 2014 • Kishore Konda, Roland Memisevic, David Krueger
We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input data and act as a selection mechanism that ensures sparsity of the representation.
no code implementations • 10 Feb 2014 • Vincent Michalski, Roland Memisevic, Kishore Konda
In this work we extend bi-linear models by introducing "higher-order mapping units" that allow us to encode transformations between frames and transformations between transformations.
no code implementations • 12 Dec 2013 • Kishore Konda, Roland Memisevic
We present a model for the joint estimation of disparity and motion.
no code implementations • 13 Jun 2013 • Kishore Reddy Konda, Roland Memisevic, Vincent Michalski
To this end, we show that the detection of spatial transformations can be viewed as the detection of synchrony between the image sequence and a sequence of features undergoing the motion we wish to detect.
no code implementations • NeurIPS 2010 • Roland Memisevic, Christopher Zach, Marc Pollefeys, Geoffrey E. Hinton
We describe a "log-bilinear" model that computes class probabilities by combining an input vector multiplicatively with a vector of binary latent variables.
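With binary latent variables the marginalization has a closed form: summing exp(Σⱼ hⱼ wⱼᵧ·x) over all binary h factorizes into Πⱼ (1 + exp(wⱼᵧ·x)). A minimal sketch of class scoring under that assumption (parameter shapes and names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_classes = 6, 4, 3
W = rng.standard_normal((n_classes, n_hid, n_in))

def class_log_scores(x):
    """Marginalize the binary latent vector h in closed form:
    sum_h exp(sum_j h_j * w_{y,j} . x) = prod_j (1 + exp(w_{y,j} . x)),
    so the log-score is a sum of softplus terms."""
    pre = W @ x                                # shape (n_classes, n_hid)
    return np.sum(np.log1p(np.exp(pre)), axis=1)

def class_probs(x):
    s = class_log_scores(x)
    s -= s.max()                               # numerical stability
    e = np.exp(s)
    return e / e.sum()
```

The multiplicative interaction between input, latent vector, and class label is what makes the model "log-bilinear": conditioned on the class, the score is bilinear in x and h.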