Search Results for author: Walterio Mayol-Cuevas

Found 24 papers, 5 papers with code

Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos

no code implementations 16 Feb 2024 Shijia Feng, Michael Wray, Brian Sullivan, Youngkyoon Jang, Casimir Ludwig, Iain Gilchrist, Walterio Mayol-Cuevas

Determining when people are struggling from video enables a finer-grained understanding of actions and opens opportunities for building intelligent visual support interfaces.

Decision Making, Video Understanding

SuperTran: Reference Based Video Transformer for Enhancing Low Bitrate Streams in Real Time

no code implementations 22 Nov 2022 Tejas Khot, Nataliya Shapovalova, Silviu Andrei, Walterio Mayol-Cuevas

This work focuses on low bitrate video streaming scenarios (e.g. 50-200 Kbps) where the video quality is severely compromised.

Super-Resolution

AROS: Affordance Recognition with One-Shot Human Stances

no code implementations 21 Oct 2022 Abel Pacheco-Ortega, Walterio Mayol-Cuevas

We present AROS, a one-shot learning approach that uses an explicit representation of interactions between highly-articulated human poses and 3D scenes.

Affordance Recognition, One-Shot Learning

On-Sensor Binarized Fully Convolutional Neural Network with A Pixel Processor Array

no code implementations 2 Feb 2022 Yanan Liu, Laurie Bose, Yao Lu, Piotr Dudek, Walterio Mayol-Cuevas

This work presents a method to implement fully convolutional neural networks (FCNs) on Pixel Processor Array (PPA) sensors, and demonstrates coarse segmentation and object localisation tasks.

Binarization, Object +2

Direct Servo Control from In-Sensor CNN Inference with A Pixel Processor Array

no code implementations 26 May 2021 Yanan Liu, Jianing Chen, Laurie Bose, Piotr Dudek, Walterio Mayol-Cuevas

This work demonstrates direct visual sensory-motor control using high-speed CNN inference via a SCAMP-5 Pixel Processor Array (PPA).

Classification

Filter Distribution Templates in Convolutional Networks for Image Classification Tasks

no code implementations 28 Apr 2021 Ramon Izquierdo-Cordova, Walterio Mayol-Cuevas

Neural network designers have progressively improved accuracy by increasing model depth, introducing new layer types, and discovering new combinations of layers.

Classification, General Classification +1

Towards Efficient Convolutional Network Models with Filter Distribution Templates

no code implementations 17 Apr 2021 Ramon Izquierdo-Cordova, Walterio Mayol-Cuevas

Increasing the number of filters in deeper layers as the feature maps shrink is a widely adopted pattern in convolutional network design.
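
The conventional design pattern this abstract refers to can be sketched as follows; the input size, base width, and stage count here are illustrative assumptions, not values taken from the paper.

```python
# Toy sketch of the "double the filters when the feature map is halved"
# convention that filter distribution templates question. The 32x32 input
# and base width of 16 are hypothetical examples.
def pyramid_filter_counts(base_filters, num_stages):
    """Return per-stage (spatial_size, filters) pairs for a toy 32x32 input,
    doubling the filter count each time the feature map is downsampled by 2."""
    size, filters = 32, base_filters
    stages = []
    for _ in range(num_stages):
        stages.append((size, filters))
        size //= 2        # spatial downsampling (e.g. stride-2 conv or pooling)
        filters *= 2      # conventional compensating increase in width
    return stages

print(pyramid_filter_counts(16, 4))
# -> [(32, 16), (16, 32), (8, 64), (4, 128)]
```

A "template", in the papers' sense, would replace this monotonically increasing schedule with a different distribution of filters across stages.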

Agile Reactive Navigation for A Non-Holonomic Mobile Robot Using A Pixel Processor Array

no code implementations 27 Sep 2020 Yanan Liu, Laurie Bose, Colin Greatwood, Jianing Chen, Rui Fan, Thomas Richardson, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

Experimental results demonstrate the algorithm's ability to enable a ground vehicle to navigate at an average speed of 2.20 m/s when passing through multiple gates, and 3.88 m/s in a 'slalom' task, in an environment featuring significant visual clutter.

Navigate

Geometric Affordance Perception: Leveraging Deep 3D Saliency With the Interaction Tensor

1 code implementation 7 Jul 2020 Eduardo Ruiz, Walterio Mayol-Cuevas

Agents that need to act on their surroundings can significantly benefit from the perception of their interaction possibilities or affordances.

Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays

no code implementations ECCV 2020 Laurie Bose, Jianing Chen, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

This is in contrast to previous works that use only sensor-level processing to sequentially compute image convolutions, and must transfer data to an external digital processor to complete the computation.

General Classification

Action Modifiers: Learning from Adverbs in Instructional Videos

1 code implementation CVPR 2020 Hazel Doughty, Ivan Laptev, Walterio Mayol-Cuevas, Dima Damen

We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations.

Video-Adverb Retrieval

A Camera That CNNs: Towards Embedded Neural Networks on Pixel Processor Arrays

no code implementations ICCV 2019 Laurie Bose, Jianing Chen, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

This allows images to be stored and manipulated directly at the point of light capture, rather than having to transfer images to external processing hardware.

Egocentric affordance detection with the one-shot geometry-driven Interaction Tensor

no code implementations 13 Jun 2019 Eduardo Ruiz, Walterio Mayol-Cuevas

In this abstract we describe recent [4, 7] and ongoing work on determining affordances in visually perceived 3D scenes.

Affordance Detection

The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos

1 code implementation CVPR 2019 Hazel Doughty, Walterio Mayol-Cuevas, Dima Damen

In addition to attending to task relevant video parts, our proposed loss jointly trains two attention modules to separately attend to video parts which are indicative of higher (pros) and lower (cons) skill.
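
The core mechanism described above is two separate softmax attention modules pooling the same per-segment features. A minimal sketch, assuming scalar per-segment features and made-up relevance scores (the paper's actual features and scoring networks are not shown here):

```python
import math

# Illustrative "pros"/"cons" attention: two independent softmax modules
# pool the same segment features, one weighting evidence of higher skill
# and one weighting evidence of lower skill. All numbers are hypothetical.
def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(segment_features, relevance_scores):
    """Softmax-weighted pooling of per-segment features."""
    weights = softmax(relevance_scores)
    return sum(w * f for w, f in zip(weights, segment_features))

features = [0.2, 0.9, 0.4]       # one scalar feature per video segment (toy)
pros_scores = [0.1, 2.0, 0.3]    # module attending to high-skill segments
cons_scores = [1.5, 0.0, 0.2]    # module attending to low-skill segments
pros_repr = attend(features, pros_scores)
cons_repr = attend(features, cons_scores)
```

In the paper the two modules are trained jointly with a rank-aware loss so that they specialise to complementary parts of the video; here they simply produce two differently weighted poolings.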

What can I do here? Leveraging Deep 3D saliency and geometry for fast and scalable multiple affordance detection

1 code implementation 3 Dec 2018 Eduardo Ruiz, Walterio Mayol-Cuevas

This paper develops and evaluates a novel method that allows for the detection of affordances in a scalable and multiple-instance manner on visually recovered pointclouds.

Affordance Detection, Multiple Affordance Detection +1

Visual Odometry for Pixel Processor Arrays

no code implementations ICCV 2017 Laurie Bose, Jianing Chen, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

We present an approach for estimating constrained motion of a novel Cellular Processor Array (CPA) camera, on which each pixel is capable of limited processing and data storage, allowing fast, low-power parallel computation to be carried out directly on the focal plane of the device.

Edge Detection, Translation +1

Towards CNN map representation and compression for camera relocalisation

no code implementations 15 Sep 2017 Luis Contreras, Walterio Mayol-Cuevas

This paper presents a study on the use of Convolutional Neural Networks for camera relocalisation and its application to map compression.

Camera Relocalization

Geometric Affordances from a Single Example via the Interaction Tensor

1 code implementation 30 Mar 2017 Eduardo Ruiz, Walterio Mayol-Cuevas

This paper develops and evaluates a new tensor field representation to express the geometric affordance of one object over another.

Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination

no code implementations CVPR 2018 Hazel Doughty, Dima Damen, Walterio Mayol-Cuevas

We present a method for assessing skill from video, applicable to a variety of tasks, ranging from surgery to drawing and rolling pizza dough.
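
Pairwise deep ranking of this kind is typically trained with a margin ranking loss over pairs ordered by skill. A minimal sketch, assuming scalar video scores and a unit margin (the scoring network itself is not reproduced here):

```python
# Toy pairwise ranking loss for skill assessment: given scores for two
# videos where the first is labelled as showing higher skill, the hinge
# is zero once the higher-skill video outscores the other by `margin`.
# Scores and the margin value are illustrative assumptions.
def margin_ranking_loss(score_better, score_worse, margin=1.0):
    """Hinge loss penalising pairs whose ordering violates the skill label."""
    return max(0.0, margin - (score_better - score_worse))

# Usage: scores could come from any video-level scoring model.
print(margin_ranking_loss(2.5, 0.5))   # ordering satisfied with margin -> 0.0
print(margin_ranking_loss(0.2, 0.6))   # violated ordering -> positive loss
```

Minimising this loss over many annotated pairs pushes the model towards a consistent skill ranking without requiring absolute skill scores as labels.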

Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

no code implementations ICCV 2017 Davide Moltisanti, Michael Wray, Walterio Mayol-Cuevas, Dima Damen

Manual annotations of temporal bounds for object interactions (i.e. start and end times) are typical training input to recognition, localization and detection algorithms.

Object

Towards CNN Map Compression for camera relocalisation

no code implementations 2 Mar 2017 Luis Contreras, Walterio Mayol-Cuevas

We use a CNN map representation and introduce the notion of CNN map compression by using a smaller CNN architecture.

Camera Relocalization, Position

SEMBED: Semantic Embedding of Egocentric Action Videos

no code implementations 28 Jul 2016 Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen

We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels.

General Classification, Object

You-Do, I-Learn: Unsupervised Multi-User egocentric Approach Towards Video-Based Guidance

no code implementations 16 Oct 2015 Dima Damen, Teesid Leelasawassuk, Walterio Mayol-Cuevas

This paper presents an unsupervised approach towards automatically extracting video-based guidance on object usage, from egocentric video and wearable gaze tracking, collected from multiple users while performing tasks.

Object
