Search Results for author: Pranava Madhyastha

Found 34 papers, 14 papers with code

Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering

no code implementations 25 Jan 2023 Chenxi Whitehouse, Tillman Weyde, Pranava Madhyastha

The field of visual question answering (VQA) has recently seen a surge in research focused on providing explanations for predicted answers.

Explanation Generation · Question Answering +1

Exploring the Long-Term Generalization of Counting Behavior in RNNs

no code implementations 29 Nov 2022 Nadine El-Naggar, Pranava Madhyastha, Tillman Weyde

Despite this and some positive empirical results for LSTMs on Dyck-1 languages, our experimental results show that LSTMs fail to learn correct counting behavior for sequences that are significantly longer than those in the training data.

Evaluation of Fake News Detection with Knowledge-Enhanced Language Models

1 code implementation 1 Apr 2022 Chenxi Whitehouse, Tillman Weyde, Pranava Madhyastha, Nikos Komninos

The predominant state-of-the-art approaches are based on fine-tuning PLMs on labelled fake news datasets.

Fake News Detection

Numerical reasoning in machine reading comprehension tasks: are we there yet?

no code implementations EMNLP 2021 Hadeel Al-Negheimish, Pranava Madhyastha, Alessandra Russo

The current standings of these models in the DROP leaderboard, over standard metrics, suggest that the models have achieved near-human performance.

Machine Reading Comprehension

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning · Multimodal Machine Translation +2

A call for better unit testing for invariant risk minimisation

1 code implementation 6 Jun 2021 Chunyang Xiao, Pranava Madhyastha

In this paper we present a controlled study on the linearized IRM framework (IRMv1) introduced in Arjovsky et al. (2020).

MultiSubs: A Large-scale Multimodal and Multilingual Dataset

1 code implementation LREC 2022 Josiah Wang, Pranava Madhyastha, Josiel Figueiredo, Chiraag Lala, Lucia Specia

The dataset will benefit research on visual grounding of words, especially in the context of free-form sentences, and can be obtained from https://doi.org/10.5281/zenodo.5034604 under a Creative Commons licence.

Multimodal Lexical Translation · Multimodal Text Prediction +1

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced.

Machine Translation · reinforcement-learning +2

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphologically rich, agglutinative languages.

Multimodal Machine Translation · Translation +2

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation · Translation

A Tweet-based Dataset for Company-Level Stock Return Prediction

no code implementations 17 Jun 2020 Karolina Sowinska, Pranava Madhyastha

Public opinion influences events, especially those related to stock market movement, where a subtle hint can influence the local outcome of the market.


Deep Copycat Networks for Text-to-Text Generation

1 code implementation IJCNLP 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Most text-to-text generation tasks, for example text summarisation and text simplification, require copying words from the input to the output.

Automatic Post-Editing · Text Generation +2

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

On Model Stability as a Function of Random Seed

1 code implementation CONLL 2019 Pranava Madhyastha, Rishabh Jain

In this paper, we focus on quantifying model stability as a function of random seed by investigating the effects of the induced randomness on model performance and the robustness of the model in general.

Probing Representations Learned by Multimodal Recurrent and Transformer Models

no code implementations 29 Aug 2019 Jindřich Libovický, Pranava Madhyastha

In this paper, we present a meta-study assessing the representational quality of models where the training signal is obtained from different modalities, in particular, language modeling, image features prediction, and both textual and multimodal machine translation.

Image Retrieval · Language Modelling +4

Predicting Actions to Help Predict Translations

no code implementations 5 Aug 2019 Zixiu Wu, Julia Ive, Josiah Wang, Pranava Madhyastha, Lucia Specia

The question we ask ourselves is whether visual features can support the translation process; in particular, given that this is a dataset extracted from videos, we focus on the translation of actions, which we believe are poorly captured in the static image-text datasets currently used for multimodal translation.


VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions

no code implementations ACL 2019 Pranava Madhyastha, Josiah Wang, Lucia Specia

It estimates the faithfulness of a generated caption with respect to the content of the actual image, based on the semantic similarity between labels of objects depicted in images and words in the description.

Semantic Similarity · Semantic Textual Similarity

Distilling Translations with Visual Awareness

1 code implementation ACL 2019 Julia Ive, Pranava Madhyastha, Lucia Specia

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.

Ranked #2 on Multimodal Machine Translation on Multi30K (Meteor (EN-FR) metric)

Multimodal Machine Translation · Translation

Model Explanations under Calibration

1 code implementation 18 Jun 2019 Rishabh Jain, Pranava Madhyastha

Explaining and interpreting the decisions of recommender systems are becoming extremely relevant, both for improving predictive performance and for providing valid explanations to users.

Recommendation Systems

Grounded Word Sense Translation

no code implementations WS 2019 Chiraag Lala, Pranava Madhyastha, Lucia Specia

Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation.

Grounded language learning · Multimodal Machine Translation +3

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation · Translation

Learning from Multiview Correlations in Open-Domain Videos

no code implementations 21 Nov 2018 Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora

This shows it is possible to learn reliable representations across disparate, unaligned and noisy modalities, and encourages using the proposed approach on larger datasets.

Representation Learning · Retrieval

End-to-end Image Captioning Exploits Multimodal Distributional Similarity

no code implementations 11 Sep 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning · Text Generation

Defoiling Foiled Image Captions

1 code implementation NAACL 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We address the task of detecting foiled image captions, i.e. identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described.

Image Captioning

Object Counts! Bringing Explicit Detections Back into Image Captioning

no code implementations NAACL 2018 Josiah Wang, Pranava Madhyastha, Lucia Specia

The use of explicit object detectors as an intermediate step to image captioning - which used to constitute an essential stage in early work - is often bypassed in the currently dominant end-to-end approaches, where the language model is conditioned directly on a mid-level image embedding.

Image Captioning · Language Modelling

What is image captioning made of?

1 code implementation ICLR 2018 Pranava Madhyastha, Josiah Wang, Lucia Specia

We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space.

Image Captioning · Text Generation
