Search Results for author: Mateusz Malinowski

Found 37 papers, 11 papers with code

CLIP-CLOP: CLIP-Guided Collage and Photomontage

1 code implementation 6 May 2022 Piotr Mirowski, Dylan Banarse, Mateusz Malinowski, Simon Osindero, Chrisantha Fernando

The unabated mystique of large-scale neural networks, such as the CLIP dual image-and-text encoder, popularized automatically generated art.

Prompt Engineering

Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

no code implementations 24 Feb 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Visual question answering provides a convenient framework for testing the model's abilities by interrogating the model through questions about the scene.

Question Answering Visual Question Answering +1

Measuring CLEVRness: Black-box Testing of Visual Reasoning Models

no code implementations ICLR 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

To answer such a question, we extend the visual question answering framework and propose the following behavioral test in the form of a two-player game.

Question Answering Visual Question Answering +1

Learning Altruistic Behaviours in Reinforcement Learning without External Rewards

no code implementations ICLR 2022 Tim Franzmeyer, Mateusz Malinowski, João F. Henriques

Such an approach assumes that other agents' goals are known so that the altruistic agent can cooperate in achieving those goals.


Visual Grounding in Video for Unsupervised Word Translation

1 code implementation CVPR 2020 Gunnar A. Sigurdsson, Jean-Baptiste Alayrac, Aida Nematzadeh, Lucas Smaira, Mateusz Malinowski, João Carreira, Phil Blunsom, Andrew Zisserman

Given this shared embedding we demonstrate that (i) we can map words between the languages, particularly the 'visual' words; (ii) that the shared embedding provides a good initialization for existing unsupervised text-based word translation techniques, forming the basis for our proposed hybrid visual-text mapping algorithm, MUVE; and (iii) our approach achieves superior performance by addressing the shortcomings of text-based methods -- it is more robust, handles datasets with less commonality, and is applicable to low-resource languages.

Translation Visual Grounding +1

Learning dynamic polynomial proofs

no code implementations NeurIPS 2019 Alhussein Fawzi, Mateusz Malinowski, Hamza Fawzi, Omar Fawzi

In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems.

Inductive Bias

The StreetLearn Environment and Dataset

1 code implementation 4 Mar 2019 Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

These datasets cannot be used for decision-making and reinforcement learning, however, and in general the perspective of navigation as an interactive learning task, where the actions and behaviours of a learning agent are learned simultaneously with the perception and planning, is relatively unsupported.

Decision Making

Learning To Follow Directions in Street View

1 code implementation 1 Mar 2019 Karl Moritz Hermann, Mateusz Malinowski, Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Raia Hadsell

Navigating and understanding the real world remains a key challenge in machine learning and inspires a great variety of research in areas such as language grounding, planning, navigation and computer vision.

Computer Vision

Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning

no code implementations 3 Dec 2018 Aishwarya Agrawal, Mateusz Malinowski, Felix Hill, Ali Eslami, Oriol Vinyals, Tejas Kulkarni

In this work, we study the setting in which an agent must learn to generate programs for diverse scenes conditioned on a given symbolic instruction.

The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR

no code implementations 11 Sep 2018 Mateusz Malinowski, Carl Doersch

Visual QA is a pivotal challenge for higher-level reasoning, requiring understanding language, vision, and relationships between many objects in a scene.

Question Answering Relational Reasoning

Learning Visual Question Answering by Bootstrapping Hard Attention

no code implementations ECCV 2018 Mateusz Malinowski, Carl Doersch, Adam Santoro, Peter Battaglia

Attention mechanisms in biological perception are thought to select subsets of perceptual information for more sophisticated processing which would be prohibitive to perform on all sensory inputs.

Computer Vision Hard Attention +2

Hyperbolic Attention Networks

no code implementations ICLR 2019 Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure.

Machine Translation Question Answering +2

Learning to Navigate in Cities Without a Map

3 code implementations NeurIPS 2018 Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.

Autonomous Navigation reinforcement-learning

Long-Term Image Boundary Prediction

no code implementations 27 Nov 2016 Apratim Bhattacharyya, Mateusz Malinowski, Bernt Schiele, Mario Fritz

Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception.

Tutorial on Answering Questions about Images with Deep Learning

1 code implementation 4 Oct 2016 Mateusz Malinowski, Mario Fritz

Together with the development of more accurate methods in Computer Vision and Natural Language Understanding, holistic architectures that answer questions about the content of real-world images have emerged.

Computer Vision Natural Language Understanding +1

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task

no code implementations 9 Aug 2016 Ashkan Mokarian, Mateusz Malinowski, Mario Fritz

We present Mean Box Pooling, a novel visual representation that pools over CNN representations of a large number of highly overlapping object proposals.

Spatio-Temporal Image Boundary Extrapolation

no code implementations 24 May 2016 Apratim Bhattacharyya, Mateusz Malinowski, Mario Fritz

Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics.

Video Segmentation Video Semantic Segmentation

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering

1 code implementation 9 May 2016 Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

By combining the latest advances in image representation and natural language processing, we propose Ask Your Neurons, a scalable, jointly trained, end-to-end formulation of this problem.

Natural Language Processing Question Answering +1

Multi-Cue Zero-Shot Learning with Strong Supervision

no code implementations CVPR 2016 Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele

A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes.

Zero-Shot Learning

Contextual Media Retrieval Using Natural Language Queries

no code implementations 16 Feb 2016 Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz

We show that our retrieval system can cope with this variability using personalisation through an online learning-based retrieval formulation.

Natural Language Queries online learning

Ask Your Neurons: A Neural-based Approach to Answering Questions about Images

no code implementations ICCV 2015 Mateusz Malinowski, Marcus Rohrbach, Mario Fritz

In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language input (image and question).

Natural Language Processing Question Answering

Hard to Cheat: A Turing Test based on Answering Questions about Images

no code implementations 14 Jan 2015 Mateusz Malinowski, Mario Fritz

Progress in language and image understanding by machines has sparked the interest of the research community in more open-ended, holistic tasks, and refueled an old AI dream of building intelligent machines.

Question Answering

A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation

no code implementations 19 Nov 2014 Mateusz Malinowski, Mario Fritz

Over the last two decades we have witnessed strong progress on modeling visual object classes, scenes and attributes that have significantly contributed to automated image understanding.

Image Retrieval

Towards a Visual Turing Challenge

no code implementations 29 Oct 2014 Mateusz Malinowski, Mario Fritz

As language and visual understanding by machines progresses rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process.

Question Answering

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

no code implementations NeurIPS 2014 Mateusz Malinowski, Mario Fritz

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision.

Computer Vision Natural Language Processing +1

Learnable Pooling Regions for Image Classification

no code implementations 15 Jan 2013 Mateusz Malinowski, Mario Fritz

Biologically inspired, from the early HMAX model to Spatial Pyramid Matching, pooling has played an important role in visual recognition pipelines.

Classification General Classification +2
