Search Results for author: Gloria Haro

Found 19 papers, 9 papers with code

Speech inpainting: Context-based speech synthesis guided by video

no code implementations • 1 Jun 2023 • Juan F. Montesinos, Daniel Michelsanti, Gloria Haro, Zheng-Hua Tan, Jesper Jensen

Audio and visual modalities are inherently connected in speech signals: lip movements and facial expressions are correlated with speech sounds.

Speech Recognition +1

A Graph-Based Method for Soccer Action Spotting Using Unsupervised Player Classification

1 code implementation • 22 Nov 2022 • Alejandro Cartas, Coloma Ballester, Gloria Haro

Action spotting in soccer videos is the task of identifying the specific time when a certain key action of the game occurs.

Action Spotting

VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices

1 code implementation • 5 Apr 2022 • Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro

Finally, we use the frozen visual features learned by our lip synchronisation model in the singing voice separation task to outperform a baseline audio-visual model which was trained end-to-end.
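
That last step can be sketched roughly as follows: a minimal PyTorch-style sketch of conditioning a separator on frozen visual features, with hypothetical module names (visual_encoder, audio_separator) that do not reproduce the actual VocaLiST or separation architectures.

import torch
import torch.nn as nn

class SeparatorWithFrozenVisualFeatures(nn.Module):
    # Hypothetical wrapper: reuse a pretrained lip-synchronisation visual
    # encoder, kept frozen, as the conditioning branch of a voice separator.
    def __init__(self, visual_encoder: nn.Module, audio_separator: nn.Module):
        super().__init__()
        self.visual_encoder = visual_encoder
        for p in self.visual_encoder.parameters():
            p.requires_grad = False          # keep the synchronisation features fixed
        self.audio_separator = audio_separator

    def forward(self, mixture_spec, lip_frames):
        with torch.no_grad():                # no gradients through the frozen encoder
            visual_feats = self.visual_encoder(lip_frames)
        return self.audio_separator(mixture_spec, visual_feats)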

Audio-Visual Synchronization · Music Source Separation

Learning Football Body-Orientation as a Matter of Classification

no code implementations • 1 Jun 2021 • Adrià Arbués-Sangüesa, Adrián Martín, Paulino Granero, Coloma Ballester, Gloria Haro

Orientation is a crucial skill for football players that becomes a differential factor in a large set of events, especially the ones involving passes.

Classification

Self-Supervised Small Soccer Player Detection and Tracking

1 code implementation • 20 Nov 2020 • Samuel Hurault, Coloma Ballester, Gloria Haro

In a soccer game, detecting and tracking the players provides crucial clues to further analyze and understand some tactical aspects of the game, including individual and team actions.

Solos: A Dataset for Audio-Visual Music Analysis

1 code implementation • 14 Jun 2020 • Juan F. Montesinos, Olga Slizovskaia, Gloria Haro

In this paper, we present a new dataset of music performance videos which can be used for training machine learning methods for multiple tasks such as audio-visual blind source separation and localization, cross-modal correspondences, cross-modal generation and, in general, any audio-visual self-supervised task.

Audio Source Separation · Audio-Visual Synchronization +1 · Audio and Speech Processing · Databases · Sound

Using Player's Body-Orientation to Model Pass Feasibility in Soccer

no code implementations • 15 Apr 2020 • Adrià Arbués-Sangüesa, Adrián Martín, Javier Fernández, Coloma Ballester, Gloria Haro

Given a monocular video of a soccer match, this paper presents a computational model to estimate the most feasible pass at any given time.

Decision Making

Conditioned Source Separation for Music Instrument Performances

1 code implementation • 8 Apr 2020 • Olga Slizovskaia, Gloria Haro, Emilia Gómez

In music source separation, the number of sources may vary for each piece and some of the sources may belong to the same family of instruments, thus sharing timbral characteristics and making the sources more correlated.

Sound · Audio and Speech Processing

Multi-channel U-Net for Music Source Separation

2 code implementations • 23 Mar 2020 • Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro, Emilia Gómez

However, Conditioned U-Net (C-U-Net) uses a control mechanism to train a single model for multi-source separation and attempts to achieve a performance comparable to that of the dedicated models.
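
Control mechanisms of this kind are commonly implemented as feature-wise (FiLM-style) modulation; below is a minimal illustrative sketch of that idea, with made-up layer sizes and a one-hot label condition, not the exact mechanism or configuration used in the paper.

import torch
import torch.nn as nn

class FiLMControl(nn.Module):
    # Illustrative control mechanism: a condition vector (e.g., a one-hot
    # source label) produces per-channel scale (gamma) and shift (beta)
    # parameters that modulate intermediate feature maps of a single network.
    def __init__(self, num_conditions: int, num_channels: int):
        super().__init__()
        self.to_gamma = nn.Linear(num_conditions, num_channels)
        self.to_beta = nn.Linear(num_conditions, num_channels)

    def forward(self, feature_map, condition):
        # feature_map: (batch, channels, freq, time); condition: (batch, num_conditions)
        gamma = self.to_gamma(condition)[:, :, None, None]
        beta = self.to_beta(condition)[:, :, None, None]
        return gamma * feature_map + beta

# Example: one model, different behaviour per requested source.
film = FiLMControl(num_conditions=4, num_channels=32)
features = torch.randn(2, 32, 64, 128)
labels = torch.eye(4)[torch.tensor([0, 2])]   # two one-hot source labels
modulated = film(features, labels)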

Music Source Separation

Always Look on the Bright Side of the Field: Merging Pose and Contextual Data to Estimate Orientation of Soccer Players

no code implementations • 2 Mar 2020 • Adrià Arbués-Sangüesa, Adrián Martín, Javier Fernández, Carlos Rodríguez, Gloria Haro, Coloma Ballester

Although orientation has proven to be a key skill for soccer players to succeed in a broad spectrum of plays, body orientation remains a little-explored area in sports analytics research.

Sports Analytics · Super-Resolution

Multi-Person tracking by multi-scale detection in Basketball scenarios

no code implementations • 10 Jul 2019 • Adrià Arbués-Sangüesa, Gloria Haro, Coloma Ballester

The presented system could be used as a data-gathering source from which useful statistics and semantic analyses can be extracted a posteriori.

A Case Study of Deep-Learned Activations via Hand-Crafted Audio Features

no code implementations • 3 Jul 2019 • Olga Slizovskaia, Emilia Gómez, Gloria Haro

We also propose a technique for measuring the similarity between activation maps and audio features, which are typically presented in the form of a matrix, such as chromagrams or spectrograms.
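
As a toy illustration of what comparing two such matrix-valued representations can look like (this is not the measure proposed in the paper), one can crop them to a common shape and correlate the flattened values:

import numpy as np

def matrix_similarity(activation_map: np.ndarray, audio_feature: np.ndarray) -> float:
    # Toy similarity between two matrix representations, e.g. a CNN
    # activation map and a chromagram: crop to a shared shape, flatten,
    # and take the Pearson correlation of the flattened values.
    rows = min(activation_map.shape[0], audio_feature.shape[0])
    cols = min(activation_map.shape[1], audio_feature.shape[1])
    a = activation_map[:rows, :cols].ravel()
    b = audio_feature[:rows, :cols].ravel()
    return float(np.corrcoef(a, b)[0, 1])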

Single-Camera Basketball Tracker through Pose and Semantic Feature Fusion

no code implementations • 5 Jun 2019 • Adrià Arbués-Sangüesa, Coloma Ballester, Gloria Haro

Tracking sports players is a highly challenging scenario, especially in single-feed videos recorded in tight courts, where clutter and occlusions cannot be avoided.

End-to-End Sound Source Separation Conditioned On Instrument Labels

no code implementations • 5 Nov 2018 • Olga Slizovskaia, Leo Kim, Gloria Haro, Emilia Gómez

Can we perform an end-to-end music source separation with a variable number of sources using a deep learning model?

Music Source Separation

FALDOI: A new minimization strategy for large displacement variational optical flow

1 code implementation • 29 Feb 2016 • Roberto P. Palomares, Enric Meinhardt-Llopis, Coloma Ballester, Gloria Haro

We propose a large displacement optical flow method that introduces a new strategy to compute a good local minimum of any optical flow energy functional.
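
For concreteness, an optical flow energy functional of the kind referred to typically combines a data-fidelity term with a smoothness regularizer; a representative (TV-L1 style) example, not necessarily the functional used in FALDOI, is

E(u) = \int_{\Omega} \big| I_1(x + u(x)) - I_0(x) \big| \, dx
     + \lambda \int_{\Omega} \big( |\nabla u_1(x)| + |\nabla u_2(x)| \big) \, dx

where I_0 and I_1 are consecutive frames, u = (u_1, u_2) is the flow field over the image domain \Omega, and \lambda weights the total-variation regularizer.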

Optical Flow Estimation

A Computational Model for Amodal Completion

no code implementations • 26 Nov 2015 • Maria Oliver, Gloria Haro, Mariella Dimiccoli, Baptiste Mazin, Coloma Ballester

This paper presents a computational model to recover the most likely interpretation of the 3D scene structure from a planar image, where some objects may occlude others.
