no code implementations • CVPR 2014 • Kenneth Alberto Funes Mora, Jean-Marc Odobez
We propose a head pose invariant gaze estimation model for distant RGB-D cameras.
no code implementations • 10 Jul 2017 • Nam Le, Jean-Marc Odobez
Learning speaker turn embeddings has shown considerable improvement in situations where conventional speaker modeling approaches fail.
1 code implementation • 30 Nov 2017 • Weipeng He, Petr Motlicek, Jean-Marc Odobez
We propose to use neural networks for simultaneous detection and localization of multiple sound sources in human-robot interaction.
no code implementations • 6 Dec 2018 • Nam Le, Jean-Marc Odobez
Collecting labeled data to train deep neural networks is costly and even impractical for many tasks.
no code implementations • 20 Apr 2019 • Gang Liu, Yu Yu, Kenneth A. Funes Mora, Jean-Marc Odobez
Non-invasive gaze estimation methods usually regress gaze directions directly from a single face or eye image.
no code implementations • CVPR 2019 • Yu Yu, Gang Liu, Jean-Marc Odobez
In this work, we address the problem of person-specific gaze model adaptation from only a few reference training samples.
no code implementations • 15 Sep 2019 • Mary Ellen Foster, Bart Craenen, Amol Deshmukh, Oliver Lemon, Emanuele Bastianelli, Christian Dondrup, Ioannis Papaioannou, Andrea Vanzo, Jean-Marc Odobez, Olivier Canévet, Yuanzhouhan Cao, Weipeng He, Angel Martínez-González, Petr Motlicek, Rémy Siegfried, Rachid Alami, Kathleen Belhassein, Guilhem Buisan, Aurélie Clodic, Amandine Mayima, Yoan Sallami, Guillaume Sarthou, Phani-Teja Singamaneni, Jules Waldhart, Alexandre Mazel, Maxime Caniot, Marketta Niemelä, Päivi Heikkilä, Hanna Lammi, Antti Tammela
In the EU-funded MuMMER project, we have developed a social robot designed to interact naturally and flexibly with users in public spaces such as a shopping mall.
no code implementations • 30 Oct 2019 • Angel Martínez-González, Michael Villamizar, Olivier Canévet, Jean-Marc Odobez
(i) we propose a fast and efficient network based on residual blocks (called RPM) for body landmark localization from depth images; (ii) we created a public dataset DIH comprising more than 170k synthetic images of human bodies with various shapes and viewpoints as well as real (annotated) data for evaluation; (iii) we show that our model trained on synthetic data from scratch can perform well on real data, obtaining similar results to larger models initialized with pre-trained networks.
no code implementations • CVPR 2020 • Yu Yu, Jean-Marc Odobez
Although automatic gaze estimation is very important to a large variety of application areas, it is difficult to train accurate and robust gaze models, in great part due to the difficulty in collecting large and diverse data (annotating 3D gaze is expensive and existing datasets use different setups).
no code implementations • 2 Dec 2019 • Angel Martínez-González, Michael Villamizar, Olivier Canévet, Jean-Marc Odobez
i) we study several CNN architecture designs combining pose machines relying on the cascade of detectors concept with lightweight and efficient CNN structures; ii) to address the need for large training datasets with high variability, we rely on semi-synthetic data combining multi-person synthetic depth data with real sensor backgrounds; iii) we explore domain adaptation techniques to address the performance gap introduced by testing on real depth images; iv) to increase the accuracy of our fast lightweight CNN models, we investigate knowledge distillation at several architecture levels, which effectively enhances performance.
1 code implementation • 4 Nov 2020 • Yihui Fu, Zhuoyuan Yao, Weipeng He, Jian Wu, Xiong Wang, Zhanheng Yang, Shimin Zhang, Lei Xie, DongYan Huang, Hui Bu, Petr Motlicek, Jean-Marc Odobez
In this challenge, we open-source a sizable speech, keyword, echo and noise corpus to promote data-driven methods, particularly deep-learning approaches, for keyword spotting (KWS) and sound source localization (SSL).
Sound • Audio and Speech Processing
1 code implementation • 10 Nov 2020 • Angel Martínez-González, Michael Villamizar, Olivier Canévet, Jean-Marc Odobez
We propose to leverage recent advances in reliable 2D pose estimation with Convolutional Neural Networks (CNN) to estimate the 3D pose of people from depth images in multi-person Human-Robot Interaction (HRI) scenarios.
1 code implementation • 2 Aug 2021 • Marco Ewerton, Angel Martínez-González, Jean-Marc Odobez
In this paper, we propose to frame the learning of pushing policies (where to push and how) by DQNs as an image-to-image translation problem and exploit an Hourglass-based architecture.
1 code implementation • 15 Sep 2021 • Angel Martínez-González, Michael Villamizar, Jean-Marc Odobez
We propose to leverage Transformer architectures for non-autoregressive human motion prediction.
no code implementations • ICCV 2023 • Samy Tafasca, Anshul Gupta, Jean-Marc Odobez
Furthermore, publicly available gaze target prediction benchmarks mostly contain instances of adults, which makes models trained on them less applicable to scenarios with young children.
1 code implementation • IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 2022 • Anshul Gupta, Samy Tafasca, Jean-Marc Odobez
to detect obstructions in the line of sight or apply attention priors that humans typically have when observing others.
no code implementations • 1 Oct 2023 • Samy Tafasca, Anshul Gupta, Jean-Marc Odobez
In this paper, we introduce a novel transformer-based architecture for 2D gaze prediction.
no code implementations • 15 Mar 2024 • Anshul Gupta, Samy Tafasca, Arya Farkhondeh, Pierre Vuillecard, Jean-Marc Odobez
Gaze following and social gaze prediction are fundamental tasks providing insights into human communication behaviors, intent, and social interactions.