Search Results for author: Stéphane Dupont

Found 22 papers, 6 papers with code

An empirical study on the effectiveness of images in Multimodal Neural Machine Translation

no code implementations • EMNLP 2017 • Jean-Benoit Delbrouck, Stéphane Dupont

In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation.

Machine Translation NMT +2


Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation

no code implementations • 4 Jul 2017 • Jean-Benoit Delbrouck, Stéphane Dupont, Omar Seddati

In Multimodal Neural Machine Translation (MNMT), a neural model generates a translated sentence that describes an image, given the image itself and one source description in English.

Dense Captioning Machine Translation +5


Modulating and attending the source image during encoding improves Multimodal Translation

1 code implementation • 9 Dec 2017 • Jean-Benoit Delbrouck, Stéphane Dupont

We propose a new and fully end-to-end approach for multimodal translation where the source text encoder modulates the entire visual input processing using conditional batch normalization, in order to compute the most informative image features for our task.

Translation


Proceedings of eNTERFACE 2015 Workshop on Intelligent Interfaces

no code implementations • 19 Jan 2018 • Matei Mancas, Christian Frisson, Joëlle Tilmanne, Nicolas D'Alessandro, Petr Barborka, Furkan Bayansar, Francisco Bernard, Rebecca Fiebrink, Alexis Heloir, Edgar Hemery, Sohaib Laraba, Alexis Moinet, Fabrizio Nunnari, Thierry Ravet, Loïc Reboursière, Alvaro Sarasua, Mickaël Tits, Noé Tits, François Zajéga, Paolo Alborno, Ksenia Kolykhalova, Emma Frid, Damiano Malafronte, Lisanne Huis in't Veld, Hüseyin Cakmak, Kevin El Haddad, Nicolas Riche, Julien Leroy, Pierre Marighetto, Bekir Berker Türker, Hossein Khaki, Roberto Pulisci, Emer Gilmartin, Fasih Haider, Kübra Cengiz, Martin Sulir, Ilaria Torre, Shabbir Marzban, Ramazan Yazıcı, Furkan Burak Bâgcı, Vedat Gazi Kılı, Hilal Sezer, Sena Büsra Yenge, Charles-Alexandre Delestage, Sylvie Leleu-Merviel, Muriel Meyer-Chemenska, Daniel Schmitt, Willy Yvart, Stéphane Dupont, Ozan Can Altiok, Aysegül Bumin, Ceren Dikmen, Ivan Giangreco, Silvan Heller, Emre Külah, Gueorgui Pironkov, Luca Rossetto, Yusuf Sahillioglu, Heiko Schuldt, Omar Seddati, Yusuf Setinkaya, Metin Sezgin, Claudiu Tanase, Emre Toyan, Sean Wood, Doguhan Yeke, François Rocca, Pierre-Henri De Deken, Alessandra Bandrabur, Fabien Grisard, Axel Jean-Caurant, Vincent Courboulay, Radhwan Ben Madhkour, Ambroise Moreau

The 11th Summer Workshop on Multimodal Interfaces, eNTERFACE 2015, was hosted by the Numediart Institute of Creative Technologies of the University of Mons from August 10th to September 2015.


UMONS Submission for WMT18 Multimodal Translation Task

1 code implementation • 15 Oct 2018 • Jean-Benoit Delbrouck, Stéphane Dupont

This paper describes the UMONS solution for the Multimodal Machine Translation Task presented at the third conference on machine translation (WMT18).

Image Captioning Multimodal Machine Translation +1


Bringing back simplicity and lightliness into neural image captioning

no code implementations • 15 Oct 2018 • Jean-Benoit Delbrouck, Stéphane Dupont

So far, the goal has been to maximize scores on automated metrics, and to do so, one has to come up with a plurality of new modules and techniques.

Caption Generation Image Captioning +2


Object-oriented Targets for Visual Navigation using Rich Semantic Representations

no code implementations • 22 Nov 2018 • Jean-Benoit Delbrouck, Stéphane Dupont

When searching for an object, humans navigate through a scene using semantic information and spatial relationships.

Navigate Object +1


Adversarial reconstruction for Multi-modal Machine Translation

no code implementations • 7 Oct 2019 • Jean-Benoit Delbrouck, Stéphane Dupont

Even with the growing interest in problems at the intersection of Computer Vision and Natural Language, grounding (i.e. identifying) the components of a structured description in an image still remains a challenging task.

Machine Translation Translation


Modulated Self-attention Convolutional Network for VQA

no code implementations • 8 Oct 2019 • Jean-Benoit Delbrouck, Antoine Maiorca, Nathan Hubens, Stéphane Dupont

As new data-sets for real-world visual reasoning and compositional question answering are emerging, it might be necessary to use the visual feature extraction as an end-to-end process during training.

Question Answering Visual Question Answering +1


Can adversarial training learn image captioning ?

1 code implementation • 31 Oct 2019 • Jean-Benoit Delbrouck, Bastien Vanderplaetse, Stéphane Dupont

Recently, generative adversarial networks (GAN) have gathered a lot of interest.

Image Captioning Text Generation


A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis

1 code implementation • WS 2020 • Jean-Benoit Delbrouck, Noé Tits, Mathilde Brousmiche, Stéphane Dupont

Understanding expressed sentiment and emotions are two crucial factors in human multimodal language.

Ranked #5 on Multimodal Sentiment Analysis on CMU-MOSEI (using extra training data)

Emotion Recognition Multimodal Sentiment Analysis


AVECL-UMONS database for audio-visual event classification and localization

no code implementations • 2 Oct 2020 • Mathilde Brousmiche, Stéphane Dupont, Jean Rouat

We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments.

General Classification


Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition

1 code implementation • EMNLP (nlpbt) 2020 • Jean-Benoit Delbrouck, Noé Tits, Stéphane Dupont

This paper aims to bring a new lightweight yet powerful solution for the task of Emotion Recognition and Sentiment Analysis.

Ranked #6 on Multimodal Sentiment Analysis on CMU-MOSEI (using extra training data)

Emotion Recognition Multimodal Sentiment Analysis


Improved Soccer Action Spotting using both Audio and Video Streams

no code implementations • 9 Nov 2020 • Bastien Vanderplaetse, Stéphane Dupont

Action spotting and classification are the tasks that consist of finding the temporal anchors of events in a video and determining which event they are.

Ranked #5 on Action Spotting on SoccerNet

Action Classification Action Spotting +2


Multi-level Attention Fusion Network for Audio-visual Event Recognition

1 code implementation • 12 Jun 2021 • Mathilde Brousmiche, Jean Rouat, Stéphane Dupont

Event classification is inherently sequential and multimodal.


Deep soccer captioning with transformer: dataset, semantics-related losses, and multi-level evaluation

no code implementations • 11 Feb 2022 • Ahmad Hammoudeh, Bastien Vanderplaetse, Stéphane Dupont

This work aims at generating captions for soccer videos using deep learning.

Optical Flow Estimation


Analysis of Co-Laughter Gesture Relationship on RGB videos in Dyadic Conversation Context

no code implementations • 20 May 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Laughter is not just an audio signal, but an intrinsic relationship of multimodal non-verbal communication: in addition to audio, it includes facial expressions and body movements.

Motion Synthesis


Transformers and CNNs both Beat Humans on SBIR

no code implementations • 14 Sep 2022 • Omar Seddati, Stéphane Dupont, Saïd Mahmoudi, Thierry Dutoit

Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and the spatial configuration of hand-drawn sketch queries.

Retrieval Sketch-Based Image Retrieval


A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation

no code implementations • 30 May 2023 • Omar Seddati, Nathan Hubens, Stéphane Dupont, Thierry Dutoit

Then, we introduce a Relative Triplet Loss (RTL), an adapted triplet loss to overcome those limitations through loss weighting based on anchors similarity.

Data Augmentation Knowledge Distillation +2


Deep learning in medical image registration: introduction and survey

no code implementations • 1 Sep 2023 • Ahmad Hammoudeh, Stéphane Dupont

Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale.

Image Registration Medical Image Registration


Analysis of Co-Laughter Gesture Relationship on RGB Videos in Dyadic Conversation Context

no code implementations • SmiLa (LREC) 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Laughter is not just an audio signal, but an intrinsic relationship of multimodal non-verbal communication: in addition to audio, it includes facial expressions and body movements.

Motion Synthesis


Are There Any Body-movement Differences between Women and Men When They Laugh?

no code implementations • SmiLa (LREC) 2022 • Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Women smile more than men, although the expressiveness of women is not universally greater across all facial actions.

Pose Estimation
