Search Results for author: Mateusz Malinowski

Found 44 papers, 14 papers with code

Learnable Pooling Regions for Image Classification

no code implementations • 15 Jan 2013 • Mateusz Malinowski, Mario Fritz

Biologically inspired, from the early HMAX model to Spatial Pyramid Matching, pooling has played an important role in visual recognition pipelines.

Classification General Classification +2

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

no code implementations • NeurIPS 2014 • Mateusz Malinowski, Mario Fritz

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision.

Question Answering

Towards a Visual Turing Challenge

no code implementations • 29 Oct 2014 • Mateusz Malinowski, Mario Fritz

As language and visual understanding by machines progresses rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process.

Question Answering

A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation

no code implementations • 19 Nov 2014 • Mateusz Malinowski, Mario Fritz

Over the last two decades we have witnessed strong progress on modeling visual object classes, scenes and attributes that has significantly contributed to automated image understanding.

Image Retrieval Retrieval

Hard to Cheat: A Turing Test based on Answering Questions about Images

no code implementations • 14 Jan 2015 • Mateusz Malinowski, Mario Fritz

Progress in language and image understanding by machines has sparked the interest of the research community in more open-ended, holistic tasks, and refueled an old AI dream of building intelligent machines.

Question Answering

Ask Your Neurons: A Neural-based Approach to Answering Questions about Images

no code implementations • ICCV 2015 • Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language input (image and question).

Question Answering

Contextual Media Retrieval Using Natural Language Queries

no code implementations • 16 Feb 2016 • Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz

We show that our retrieval system can cope with this variability using personalisation through an online learning-based retrieval formulation.

Natural Language Queries Retrieval

Multi-Cue Zero-Shot Learning with Strong Supervision

no code implementations • CVPR 2016 • Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele

A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes.

Attribute Retrieval +1

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering

1 code implementation • 9 May 2016 • Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

By combining latest advances in image representation and natural language processing, we propose Ask Your Neurons, a scalable, jointly trained, end-to-end formulation to this problem.

Question Answering Visual Question Answering

117

Spatio-Temporal Image Boundary Extrapolation

no code implementations • 24 May 2016 • Apratim Bhattacharyya, Mateusz Malinowski, Mario Fritz

Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics.

Video Segmentation Video Semantic Segmentation

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task

no code implementations • 9 Aug 2016 • Ashkan Mokarian, Mateusz Malinowski, Mario Fritz

We present Mean Box Pooling, a novel visual representation that pools over CNN representations of a large number of highly overlapping object proposals.

Tutorial on Answering Questions about Images with Deep Learning

1 code implementation • 4 Oct 2016 • Mateusz Malinowski, Mario Fritz

Together with the development of more accurate methods in Computer Vision and Natural Language Understanding, holistic architectures that answer questions about the content of real-world images have emerged.

Natural Language Understanding Visual Question Answering (VQA)

117

Long-Term Image Boundary Prediction

no code implementations • 27 Nov 2016 • Apratim Bhattacharyya, Mateusz Malinowski, Bernt Schiele, Mario Fritz

Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception.

A simple neural network module for relational reasoning

20 code implementations • NeurIPS 2017 • Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap

Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.

Ranked #5 on Image Retrieval with Multi-Modal Query on Fashion200k

Image Retrieval with Multi-Modal Query Question Answering +2

806

Learning to Navigate in Cities Without a Map

4 code implementations • NeurIPS 2018 • Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.

Autonomous Navigation Navigate +2

973

Hyperbolic Attention Networks

no code implementations • ICLR 2019 • Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure.

Machine Translation Question Answering +2

Relational inductive biases, deep learning, and graph networks

31 code implementations • 4 Jun 2018 • Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, Yujia Li, Razvan Pascanu

As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.

Decision Making Inductive Bias +1

5,323

Learning Visual Question Answering by Bootstrapping Hard Attention

no code implementations • ECCV 2018 • Mateusz Malinowski, Carl Doersch, Adam Santoro, Peter Battaglia

Attention mechanisms in biological perception are thought to select subsets of perceptual information for more sophisticated processing which would be prohibitive to perform on all sensory inputs.

Ranked #7 on Visual Question Answering (VQA) on CLEVR

Hard Attention Question Answering +1

The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR

no code implementations • 11 Sep 2018 • Mateusz Malinowski, Carl Doersch

Visual QA is a pivotal challenge for higher-level reasoning, requiring understanding language, vision, and relationships between many objects in a scene.

Question Answering Relational Reasoning

Playing the Game of Universal Adversarial Perturbations

no code implementations • 20 Sep 2018 • Julien Perolat, Mateusz Malinowski, Bilal Piot, Olivier Pietquin

We study the problem of learning classifiers robust to universal adversarial perturbations.

Classification General Classification +1

Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning

no code implementations • 3 Dec 2018 • Aishwarya Agrawal, Mateusz Malinowski, Felix Hill, Ali Eslami, Oriol Vinyals, Tejas Kulkarni

In this work, we study the setting in which an agent must learn to generate programs for diverse scenes conditioned on a given symbolic instruction.

Learning To Follow Directions in Street View

1 code implementation • 1 Mar 2019 • Karl Moritz Hermann, Mateusz Malinowski, Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Raia Hadsell

Navigating and understanding the real world remains a key challenge in machine learning and inspires a great variety of research in areas such as language grounding, planning, navigation and computer vision.

Instruction Following Navigate +1

281

The StreetLearn Environment and Dataset

1 code implementation • 4 Mar 2019 • Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

These datasets cannot be used for decision-making and reinforcement learning, however, and in general the perspective of navigation as an interactive learning task, where the actions and behaviours of a learning agent are learned simultaneously with the perception and planning, is relatively unsupported.

Decision Making

281

Learning dynamic polynomial proofs

no code implementations • NeurIPS 2019 • Alhussein Fawzi, Mateusz Malinowski, Hamza Fawzi, Omar Fawzi

In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems.

BIG-bench Machine Learning Inductive Bias

Sideways: Depth-Parallel Training of Video Models

no code implementations • CVPR 2020 • Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean

We propose Sideways, an approximate backpropagation scheme for training video models.

Visual Grounding in Video for Unsupervised Word Translation

1 code implementation • CVPR 2020 • Gunnar A. Sigurdsson, Jean-Baptiste Alayrac, Aida Nematzadeh, Lucas Smaira, Mateusz Malinowski, João Carreira, Phil Blunsom, Andrew Zisserman

Given this shared embedding we demonstrate that (i) we can map words between the languages, particularly the 'visual' words; (ii) that the shared embedding provides a good initialization for existing unsupervised text-based word translation techniques, forming the basis for our proposed hybrid visual-text mapping algorithm, MUVE; and (iii) our approach achieves superior performance by addressing the shortcomings of text-based methods -- it is more robust, handles datasets with less commonality, and is applicable to low-resource languages.

Translation Visual Grounding +1

IReEn: Reverse-Engineering of Black-Box Functions via Iterative Neural Program Synthesis

no code implementations • NeurIPS Workshop CAP 2020 • Hossein Hajipour, Mateusz Malinowski, Mario Fritz

In this work, we investigate the problem of revealing the functionality of a black-box agent.

Computer Security Program Synthesis

Broaden Your Views for Self-Supervised Video Learning

1 code implementation • ICCV 2021 • Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Ross Hemsley, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Patraucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman

Most successful self-supervised learning methods are trained to align the representations of two independent views from the data.

Ranked #1 on Self-Supervised Action Recognition on HMDB51 (finetuned)

Audio Classification Optical Flow Estimation +4

Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning

no code implementations • 7 Jun 2021 • Piotr Piękos, Henryk Michalewski, Mateusz Malinowski

How many fruits do you have in total?

Language Modelling Math

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

no code implementations • CVPR 2021 • Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira

How can neural networks be trained on large-volume temporal data efficiently?

Action Recognition Blocking

Learning Altruistic Behaviours in Reinforcement Learning without External Rewards

no code implementations • ICLR 2022 • Tim Franzmeyer, Mateusz Malinowski, João F. Henriques

Such an approach assumes that other agents' goals are known so that the altruistic agent can cooperate in achieving those goals.

reinforcement-learning Reinforcement Learning (RL)

Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning

no code implementations • ACL 2021 • Piotr Piękos, Mateusz Malinowski, Henryk Michalewski

How many fruits do you have in total?

Language Modelling Math

Measuring CLEVRness: Black-box Testing of Visual Reasoning Models

no code implementations • ICLR 2022 • Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

To answer such a question, we extend the visual question answering framework and propose the following behavioral test in the form of a two-player game.

Benchmarking Question Answering +2

General-purpose, long-context autoregressive modeling with Perceiver AR

3 code implementations • 15 Feb 2022 • Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel

Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression.

Ranked #35 on Language Modelling on WikiText-103

Density Estimation Language Modelling

404

Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

no code implementations • 24 Feb 2022 • Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Visual question answering provides a convenient framework for testing the model's abilities by interrogating the model through questions about the scene.

Benchmarking Question Answering +2

Transframer: Arbitrary Frame Prediction with Generative Models

no code implementations • 17 Mar 2022 • Charlie Nash, João Carreira, Jacob Walker, Iain Barr, Andrew Jaegle, Mateusz Malinowski, Peter Battaglia

We present a general-purpose framework for image modelling and vision tasks based on probabilistic frame prediction.

Image Classification Image Segmentation +4

CLIP-CLOP: CLIP-Guided Collage and Photomontage

1 code implementation • 6 May 2022 • Piotr Mirowski, Dylan Banarse, Mateusz Malinowski, Simon Osindero, Chrisantha Fernando

The unabated mystique of large-scale neural networks, such as the CLIP dual image-and-text encoder, popularized automatically generated art.

Prompt Engineering

236

Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members

no code implementations • 18 Aug 2022 • Daphne Cornelisse, Thomas Rood, Mateusz Malinowski, Yoram Bachrach, Tal Kachman

Cooperative game theory offers solution concepts identifying distribution schemes, such as the Shapley value, that fairly reflect the contribution of individuals to the performance of the team or the Core, which reduces the incentive of agents to abandon their team.

Compressed Vision for Efficient Video Understanding

no code implementations • 6 Oct 2022 • Olivia Wiles, Joao Carreira, Iain Barr, Andrew Zisserman, Mateusz Malinowski

In this work, we propose a framework enabling research on hour-long videos with the same hardware that can now process second-long videos.

Video Compression Video Understanding

Perception Test: A Diagnostic Benchmark for Multimodal Models

1 code implementation • DeepMind 2022 • Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Skanda Koppula, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman, João Carreira

We propose a novel multimodal benchmark – the Perception Test – that aims to extensively evaluate perception and reasoning skills of multimodal models.

Multiple-choice Question Answering +1

151

A Simple, Yet Effective Approach to Finding Biases in Code Generation

no code implementations • 31 Oct 2022 • Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski

This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones, which can reduce the quality of the generated code under specific circumstances.

Causal Language Modeling Code Generation +2

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

2 code implementations • NeurIPS 2023 • Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Skanda Koppula, Joseph Heyward, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman, João Carreira

We propose a novel multimodal video benchmark - the Perception Test - to evaluate the perception and reasoning skills of pre-trained multimodal models (e.g., Flamingo, SeViLA, or GPT-4).

counterfactual Descriptive +2

151

SODA: Bottleneck Diffusion Models for Representation Learning

1 code implementation • 29 Nov 2023 • Drew A. Hudson, Daniel Zoran, Mateusz Malinowski, Andrew K. Lampinen, Andrew Jaegle, James L. McClelland, Loic Matthey, Felix Hill, Alexander Lerchner

We introduce SODA, a self-supervised diffusion model, designed for representation learning.

Denoising Image Generation +3

Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models

no code implementations • 6 Feb 2024 • Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Large Language Models (LLMs) demonstrate ever-increasing abilities in mathematical and algorithmic tasks, yet their geometric reasoning skills are underexplored.

Mathematical Reasoning Variable Selection
