no code implementations • 22 Mar 2024 • Sofia Casarin, Cynthia I. Ugwu, Sergio Escalera, Oswald Lanz
The landscape of deep learning research is moving towards innovative strategies to harness the true potential of data.
no code implementations • 17 Dec 2022 • Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz
Consequently, existing solutions based on action recognition models are only suboptimal.
1 code implementation • 3 Aug 2022 • Alex Falcon, Giuseppe Serra, Oswald Lanz
Data augmentation techniques were introduced to improve performance on unseen test examples by creating new training samples through semantics-preserving transformations, such as color-space or geometric transformations of images.
no code implementations • 22 Jun 2022 • Alex Falcon, Giuseppe Serra, Sergio Escalera, Oswald Lanz
This report presents the technical details of our submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022.
Ranked #3 on Multi-Instance Retrieval on EPIC-KITCHENS-100
no code implementations • 22 Jun 2022 • Tsung-Ming Tai, Oswald Lanz, Giuseppe Fiameni, Yi-Kwan Wong, Sze-Sen Poon, Cheng-Kuang Lee, Ka-Chun Cheung, Simon See
In this report, we describe the technical details of our submission for the EPIC-Kitchens-100 action anticipation challenge.
1 code implementation • 2 Jun 2022 • Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz
To this end, we propose a unified recurrence modeling for video action anticipation via message passing framework.
1 code implementation • 27 Apr 2022 • Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz
We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance.
Ranked #7 on Multi-Instance Retrieval on EPIC-KITCHENS-100
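The fixed-margin baseline the paper improves on can be illustrated with a standard triplet loss, where the margin is a hand-tuned hyper-parameter (this is a generic sketch of the baseline, not the relevance-based margin proposed in the paper):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # standard hinge formulation: the fixed margin is a hyper-parameter
    # that must be tuned; the paper replaces it with a value derived
    # from caption relevance instead.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

When the negative is already far enough away (by more than the margin), the loss is zero and the triplet contributes no gradient.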
2 code implementations • 16 Mar 2022 • Alex Falcon, Giuseppe Serra, Oswald Lanz
Due to the amount of videos and related captions uploaded every hour, deep learning-based solutions for cross-modal video retrieval are attracting more and more attention.
Ranked #5 on Multi-Instance Retrieval on EPIC-KITCHENS-100
1 code implementation • 16 Mar 2022 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs.
Ranked #17 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
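The complexity reduction from kernel factorization can be seen by counting parameters. The sketch below compares a full 3D kernel against a common (2+1)D-style split into a spatial and a temporal convolution; this is one generic factorization, not necessarily the scheme proposed in the paper:

```python
def conv3d_params(c_in, c_out, t, k):
    # full 3D kernel: every output channel sees a t x k x k input volume
    return c_out * c_in * t * k * k

def factorized_params(c_in, c_out, t, k):
    # (2+1)D-style factorization: a spatial 1 x k x k convolution
    # followed by a temporal t x 1 x 1 convolution
    spatial = c_out * c_in * 1 * k * k
    temporal = c_out * c_out * t * 1 * 1
    return spatial + temporal

# e.g. 64 -> 64 channels with a 3 x 3 x 3 kernel:
full = conv3d_params(64, 64, 3, 3)        # 110592 parameters
split = factorized_params(64, 64, 3, 3)   # 49152 parameters
```

The factorized form uses fewer than half the parameters here, and the gap widens with larger temporal extents.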
no code implementations • 6 Oct 2021 • Swathikiran Sudhakaran, Adrian Bulat, Juan-Manuel Perez-Rua, Alex Falcon, Sergio Escalera, Oswald Lanz, Brais Martinez, Georgios Tzimiropoulos
This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.
1 code implementation • 17 Apr 2021 • Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Oswald Lanz
Endowing visual agents with predictive capability is a key step towards video intelligence at scale.
no code implementations • 16 Feb 2021 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame level features by leveraging the verb-noun structure of action labels in egocentric video datasets.
no code implementations • 22 Aug 2020 • Alex Falcon, Oswald Lanz, Giuseppe Serra
Video Question Answering (VideoQA) requires a model to analyze and understand both the visual content of the input video and the textual content of the question, as well as the interaction between them, in order to produce a meaningful answer.
1 code implementation • 6 Jul 2020 • Mohamed Ilyes Lakhal, Davide Boscaini, Fabio Poiesi, Oswald Lanz, Andrea Cavallaro
We first estimate the 3D mesh of the target body and transfer the rough textures from the 2D images to the mesh.
no code implementations • 24 Jun 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
In this report we describe the technical details of our submission to the EPIC-Kitchens Action Recognition 2020 Challenge.
2 code implementations • CVPR 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Deep 3D CNNs for video action recognition are designed to learn powerful representations in the joint spatio-temporal feature space.
Ranked #26 on Action Recognition on Something-Something V1 (using extra training data)
no code implementations • 2 Jul 2019 • Swathikiran Sudhakaran, Oswald Lanz
We review three recent deep learning based methods for action recognition and present a brief comparative analysis of the methods from a neurophysiological point of view.
no code implementations • 21 Jun 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
In this report we describe the technical details of our submission to the EPIC-Kitchens 2019 action recognition challenge.
no code implementations • 29 May 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Most action recognition methods are based on a) late aggregation of frame-level CNN features using average pooling, max pooling, or an RNN, among others, or b) spatio-temporal aggregation via 3D convolutions.
Ranked #51 on Action Recognition on HMDB-51 (using extra training data)
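The late-aggregation strategies in category a) reduce a sequence of per-frame descriptors to a single clip-level vector. A minimal sketch of the average- and max-pooling variants (the function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def late_aggregate(frame_feats, mode="avg"):
    """Pool per-frame CNN descriptors of shape (T, D) into one clip vector."""
    if mode == "avg":
        # order-insensitive mean over the temporal axis
        return frame_feats.mean(axis=0)
    if mode == "max":
        # keeps the strongest response of each feature dimension
        return frame_feats.max(axis=0)
    raise ValueError(f"unknown mode: {mode}")
```

Both variants discard temporal ordering, which is precisely the limitation that spatio-temporal aggregation via 3D convolutions addresses.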
1 code implementation • CVPR 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Egocentric activity recognition is one of the most challenging tasks in video analysis.
Ranked #5 on Egocentric Activity Recognition on EGTEA
no code implementations • 29 Aug 2018 • Swathikiran Sudhakaran, Oswald Lanz
Most recent approaches for action recognition from video leverage deep architectures to encode the video clip into a fixed length representation vector that is then used for classification.
1 code implementation • 31 Jul 2018 • Swathikiran Sudhakaran, Oswald Lanz
Our model is built on the observation that egocentric activities are highly characterized by the objects and their locations in the video.
Ranked #6 on Egocentric Activity Recognition on EGTEA
no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz
The proposed approach uses a pair of convolutional neural networks, whose parameters are shared, for extracting frame level features from successive frames of the video.
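The idea of a shared-parameter pair of networks can be sketched as applying one feature extractor to two successive frames and comparing the outputs (a toy illustration with a single linear-ReLU layer standing in for the CNN; all names and shapes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # one weight matrix shared by both streams

def extract(frame):
    # the same parameters W process every frame (weight sharing),
    # so features of successive frames are directly comparable
    return np.maximum(0.0, frame @ W)  # toy ReLU feature extractor

frame_t = rng.standard_normal(8)
frame_t1 = rng.standard_normal(8)
motion_cue = extract(frame_t1) - extract(frame_t)  # change between frames
```

Because both frames pass through identical parameters, the difference of their features captures what changed between them rather than differences between two separately trained extractors.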
no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz
A convolutional neural network is used to extract frame level features from a video.
no code implementations • ICCV 2015 • Elisa Ricci, Jagannadan Varadarajan, Ramanathan Subramanian, Samuel Rota Bulo, Narendra Ahuja, Oswald Lanz
We present a novel approach for jointly estimating targets' head and body orientations and conversational groups, called F-formations, from a distant social scene (e.g., a cocktail party captured by surveillance cameras).
no code implementations • 23 Jun 2015 • Xavier Alameda-Pineda, Jacopo Staiano, Ramanathan Subramanian, Ligia Batrinca, Elisa Ricci, Bruno Lepri, Oswald Lanz, Nicu Sebe
Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels.