Search Results for author: Mateusz Malinowski

Found 44 papers, 14 papers with code

Learnable Pooling Regions for Image Classification

no code implementations 15 Jan 2013 Mateusz Malinowski, Mario Fritz

Biologically inspired, from the early HMAX model to Spatial Pyramid Matching, pooling has played an important role in visual recognition pipelines.

Classification General Classification +2

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

no code implementations NeurIPS 2014 Mateusz Malinowski, Mario Fritz

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision.

Question Answering

Towards a Visual Turing Challenge

no code implementations 29 Oct 2014 Mateusz Malinowski, Mario Fritz

As language and visual understanding by machines progresses rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process.

Question Answering

A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation

no code implementations 19 Nov 2014 Mateusz Malinowski, Mario Fritz

Over the last two decades we have witnessed strong progress on modeling visual object classes, scenes and attributes that have significantly contributed to automated image understanding.

Image Retrieval Retrieval

Hard to Cheat: A Turing Test based on Answering Questions about Images

no code implementations 14 Jan 2015 Mateusz Malinowski, Mario Fritz

Progress in language and image understanding by machines has sparked the interest of the research community in more open-ended, holistic tasks, and refueled an old AI dream of building intelligent machines.

Question Answering

Ask Your Neurons: A Neural-based Approach to Answering Questions about Images

no code implementations ICCV 2015 Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language input (image and question).

Question Answering

Contextual Media Retrieval Using Natural Language Queries

no code implementations 16 Feb 2016 Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz

We show that our retrieval system can cope with this variability using personalisation through an online learning-based retrieval formulation.

Natural Language Queries Retrieval

Multi-Cue Zero-Shot Learning with Strong Supervision

no code implementations CVPR 2016 Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele

A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes.

Attribute Retrieval +1

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering

1 code implementation 9 May 2016 Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

By combining latest advances in image representation and natural language processing, we propose Ask Your Neurons, a scalable, jointly trained, end-to-end formulation to this problem.

Question Answering Visual Question Answering

Spatio-Temporal Image Boundary Extrapolation

no code implementations 24 May 2016 Apratim Bhattacharyya, Mateusz Malinowski, Mario Fritz

Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics.

Video Segmentation Video Semantic Segmentation

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task

no code implementations 9 Aug 2016 Ashkan Mokarian, Mateusz Malinowski, Mario Fritz

We present Mean Box Pooling, a novel visual representation that pools over CNN representations of a large number of highly overlapping object proposals.

Tutorial on Answering Questions about Images with Deep Learning

1 code implementation 4 Oct 2016 Mateusz Malinowski, Mario Fritz

Together with the development of more accurate methods in Computer Vision and Natural Language Understanding, holistic architectures that answer questions about the content of real-world images have emerged.

Natural Language Understanding Visual Question Answering (VQA)

Long-Term Image Boundary Prediction

no code implementations 27 Nov 2016 Apratim Bhattacharyya, Mateusz Malinowski, Bernt Schiele, Mario Fritz

Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception.

Learning to Navigate in Cities Without a Map

4 code implementations NeurIPS 2018 Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.

Autonomous Navigation Navigate +2

Hyperbolic Attention Networks

no code implementations ICLR 2019 Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure.

Machine Translation Question Answering +2

Learning Visual Question Answering by Bootstrapping Hard Attention

no code implementations ECCV 2018 Mateusz Malinowski, Carl Doersch, Adam Santoro, Peter Battaglia

Attention mechanisms in biological perception are thought to select subsets of perceptual information for more sophisticated processing which would be prohibitive to perform on all sensory inputs.

Hard Attention Question Answering +1

The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR

no code implementations 11 Sep 2018 Mateusz Malinowski, Carl Doersch

Visual QA is a pivotal challenge for higher-level reasoning, requiring understanding language, vision, and relationships between many objects in a scene.

Question Answering Relational Reasoning

Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning

no code implementations 3 Dec 2018 Aishwarya Agrawal, Mateusz Malinowski, Felix Hill, Ali Eslami, Oriol Vinyals, Tejas Kulkarni

In this work, we study the setting in which an agent must learn to generate programs for diverse scenes conditioned on a given symbolic instruction.

Learning To Follow Directions in Street View

1 code implementation 1 Mar 2019 Karl Moritz Hermann, Mateusz Malinowski, Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Raia Hadsell

Navigating and understanding the real world remains a key challenge in machine learning and inspires a great variety of research in areas such as language grounding, planning, navigation and computer vision.

Instruction Following Navigate +1

The StreetLearn Environment and Dataset

1 code implementation 4 Mar 2019 Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

These datasets cannot be used for decision-making and reinforcement learning, however, and in general the perspective of navigation as an interactive learning task, where the actions and behaviours of a learning agent are learned simultaneously with the perception and planning, is relatively unsupported.

Decision Making

Learning dynamic polynomial proofs

no code implementations NeurIPS 2019 Alhussein Fawzi, Mateusz Malinowski, Hamza Fawzi, Omar Fawzi

In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems.

BIG-bench Machine Learning Inductive Bias

Visual Grounding in Video for Unsupervised Word Translation

1 code implementation CVPR 2020 Gunnar A. Sigurdsson, Jean-Baptiste Alayrac, Aida Nematzadeh, Lucas Smaira, Mateusz Malinowski, João Carreira, Phil Blunsom, Andrew Zisserman

Given this shared embedding we demonstrate that (i) we can map words between the languages, particularly the 'visual' words; (ii) the shared embedding provides a good initialization for existing unsupervised text-based word translation techniques, forming the basis for our proposed hybrid visual-text mapping algorithm, MUVE; and (iii) our approach achieves superior performance by addressing the shortcomings of text-based methods: it is more robust, handles datasets with less commonality, and is applicable to low-resource languages.

Translation Visual Grounding +1

Measuring CLEVRness: Black-box Testing of Visual Reasoning Models

no code implementations ICLR 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

To answer such a question, we extend the visual question answering framework and propose the following behavioral test in the form of a two-player game.

Benchmarking Question Answering +2

Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

no code implementations 24 Feb 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Visual question answering provides a convenient framework for testing the model's abilities by interrogating the model through questions about the scene.

Benchmarking Question Answering +2

CLIP-CLOP: CLIP-Guided Collage and Photomontage

1 code implementation 6 May 2022 Piotr Mirowski, Dylan Banarse, Mateusz Malinowski, Simon Osindero, Chrisantha Fernando

The unabated mystique of large-scale neural networks, such as the CLIP dual image-and-text encoder, popularized automatically generated art.

Prompt Engineering

Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members

no code implementations 18 Aug 2022 Daphne Cornelisse, Thomas Rood, Mateusz Malinowski, Yoram Bachrach, Tal Kachman

Cooperative game theory offers solution concepts identifying distribution schemes, such as the Shapley value, that fairly reflect the contribution of individuals to the performance of the team or the Core, which reduces the incentive of agents to abandon their team.

Compressed Vision for Efficient Video Understanding

no code implementations 6 Oct 2022 Olivia Wiles, Joao Carreira, Iain Barr, Andrew Zisserman, Mateusz Malinowski

In this work, we propose a framework enabling research on hour-long videos with the same hardware that can now process second-long videos.

Video Compression Video Understanding

A Simple, Yet Effective Approach to Finding Biases in Code Generation

no code implementations 31 Oct 2022 Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski

This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones, which can reduce the quality of the generated code under specific circumstances.

Causal Language Modeling Code Generation +2

Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models

no code implementations 6 Feb 2024 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Large Language Models (LLMs) demonstrate ever-increasing abilities in mathematical and algorithmic tasks, yet their geometric reasoning skills are underexplored.

Mathematical Reasoning Variable Selection
