Search Results for author: Basura Fernando

Found 62 papers, 13 papers with code

CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes

no code implementations • 1 Apr 2024 • Ting En Lam, Yuhan Chen, Elston Tan, Eric Peh, Ruirui Chen, Paritosh Parmar, Basura Fernando

We will release our dataset, code, and models to help future efforts in this domain.

Question Answering Video Question Answering

Zero Shot Open-ended Video Inference

no code implementations • 23 Jan 2024 • Ee Yeo Keat, Zhang Hao, Alexander Matyasko, Basura Fernando

Zero-shot open-ended inference on untrimmed videos poses a significant challenge, especially when no annotated data is utilized to navigate the inference direction.

Action Recognition Language Modelling +2

Learning to Visually Connect Actions and their Effects

no code implementations • 19 Jan 2024 • Eric Peh, Paritosh Parmar, Basura Fernando

We propose different CATE-based task formulations, such as action selection and action specification, where video understanding models connect actions and effects at semantic and fine-grained levels.

Video Understanding

Motion Flow Matching for Human Motion Synthesis and Editing

no code implementations • 14 Dec 2023 • Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, Yunlu Chen, Basura Fernando, Yuki M Asano, Efstratios Gavves, Pascal Mettes, Bjorn Ommer, Cees G. M. Snoek

In this paper, we propose Motion Flow Matching, a novel generative model designed for human motion generation featuring efficient sampling and effectiveness in motion editing applications.

Motion Interpolation motion prediction +1

Semi-supervised multimodal coreference resolution in image narrations

1 code implementation • 20 Oct 2023 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

In this paper, we study multimodal coreference resolution, specifically where a longer descriptive text, i.e., a narration, is paired with an image.

coreference-resolution Descriptive

ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition

1 code implementation • 2 Jul 2023 • Debaditya Roy, Dhruv Verma, Basura Fernando

Situation Recognition is the task of generating a structured summary of what is happening in an image using an activity verb and the semantic roles played by actors and objects.

Ranked #1 on Situation Recognition on imSitu

Grounded Situation Recognition

Revealing the Illusion of Joint Multimodal Understanding in VideoQA Models

no code implementations • 15 Jun 2023 • Ishaan Singh Rawal, Shantanu Jaiswal, Basura Fernando, Cheston Tan

We evaluate models on CLAVI and find that all models achieve high performance on multimodal shortcut instances, but most of them have poor performance on the counterfactual instances that necessitate joint multimodal understanding.

Benchmarking counterfactual

Modelling Spatio-Temporal Interactions for Compositional Action Recognition

no code implementations • 4 May 2023 • Ramanathan Rajendiran, Debaditya Roy, Basura Fernando

The final context-infused spatio-temporal interaction tokens are used for compositional action recognition.

Action Recognition Human-Object Interaction Detection +1

A Region-Prompted Adapter Tuning for Visual Abductive Reasoning

no code implementations • 18 Mar 2023 • Hao Zhang, Yeo Keat Ee, Basura Fernando

Existing works highlight cues utilizing a specific prompt (e.g., colorful prompt).

Ranked #1 on Visual Abductive Reasoning on SHERLOCK

Visual Abductive Reasoning

Energy-based Self-Training and Normalization for Unsupervised Domain Adaptation

no code implementations • ICCV 2023 • Samitha Herath, Basura Fernando, Ehsan Abbasnejad, Munawar Hayat, Shahram Khadivi, Mehrtash Harandi, Hamid Rezatofighi, Gholamreza Haffari

(1) EBL can be used to improve the instance selection for a self-training task on the unlabelled target domain, and (2) aligning and normalizing energy scores can learn domain-invariant representations.

Unsupervised Domain Adaptation

Who are you referring to? Coreference resolution in image narrations

no code implementations • ICCV 2023 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

Coreference resolution aims to identify words and phrases which refer to the same entity in a text, a core task in natural language processing.

coreference-resolution

Interaction Region Visual Transformer for Egocentric Action Anticipation

no code implementations • 25 Nov 2022 • Debaditya Roy, Ramanathan Rajendiran, Basura Fernando

On the EK100 evaluation server, InAViT is the top-performing method on the public leaderboard (at the time of submission), where it outperforms the second-best model by 3.3% on mean-top5 recall.

Ranked #1 on Action Anticipation on EGTEA

Action Anticipation Human-Object Interaction Detection +1

Abductive Action Inference

no code implementations • 24 Oct 2022 • Clement Tan, Chai Kiat Yeo, Cheston Tan, Basura Fernando

In this paper, we introduce a novel research task known as "abductive action inference" which addresses the question of which actions were executed by a human to reach a specific state shown in a single snapshot.

Predicting the Next Action by Modeling the Abstract Goal

no code implementations • 12 Sep 2022 • Debaditya Roy, Basura Fernando

It is through the submission of this paper that our method is currently the new state-of-the-art for action anticipation in EK55 and EGTEA Gaze+ (https://competitions.codalab.org/competitions/20071#results). Code available at https://github.com/debadityaroy/Abstract_Goal

Ranked #1 on Action Anticipation on EPIC-KITCHENS-55 (Seen test set (S1))

Action Anticipation

Consistency Regularization for Domain Adaptation

1 code implementation • 23 Aug 2022 • Kian Boon Koh, Basura Fernando

Collection of real world annotations for training semantic segmentation models is an expensive process.

Self-Learning Semantic Segmentation +1

3D Equivariant Graph Implicit Functions

no code implementations • 31 Mar 2022 • Yunlu Chen, Basura Fernando, Hakan Bilen, Matthias Nießner, Efstratios Gavves

In this work, we address two key limitations of such representations: failing to capture local 3D geometric fine details, and failing to learn from and generalize to shapes with unseen 3D transformations.

LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach

no code implementations • 19 Dec 2021 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hiroya Takamura, Qi Wu

We propose LocFormer, a Transformer-based model for video grounding which operates at a constant memory footprint regardless of the video length, i.e., the number of frames.

Inductive Bias Video Grounding

TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs

1 code implementation • 26 Nov 2021 • Shantanu Jaiswal, Basura Fernando, Cheston Tan

Attention modules for Convolutional Neural Networks (CNNs) are an effective method to enhance performance on multiple computer-vision tasks.

feature selection Image Classification +3

Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation

no code implementations • CVPR 2022 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects, which is essential for full scene understanding.

Graph Generation Informativeness +2

Action Forecasting with Feature-wise Self-Attention

no code implementations • 19 Jul 2021 • Yan Bin Ng, Basura Fernando

A temporal recurrent encoder captures temporal information of input videos while a self-attention model is used to attend on relevant feature dimensions of the input space.

Anticipating human actions by correlating past with the future with Jaccard similarity measures

no code implementations • CVPR 2021 • Basura Fernando, Samitha Herath

We propose a framework for early action recognition and anticipation by correlating past features with the future using three novel similarity measures called Jaccard vector similarity, Jaccard cross-correlation and Jaccard Frobenius inner product over covariances.

Action Anticipation Action Recognition

A Log-likelihood Regularized KL Divergence for Video Prediction with A 3D Convolutional Variational Recurrent Network

no code implementations • 11 Dec 2020 • Haziq Razali, Basura Fernando

In this paper, we introduce a new variational model that extends the recurrent network in two ways for the task of video frame prediction.

Video Prediction

FlowCaps: Optical Flow Estimation with Capsule Networks For Action Recognition

no code implementations • 8 Nov 2020 • Vinoj Jayasundara, Debaditya Roy, Basura Fernando

Capsule networks (CapsNets) have recently shown promise to excel in most computer vision tasks, especially pertaining to scene understanding.

Action Recognition Optical Flow Estimation +1

DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

1 code implementation • 13 Oct 2020 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hongdong Li, Stephen Gould

This paper studies the task of temporal moment localization in a long untrimmed video using a natural language query.

Sentence

Inferring Temporal Compositions of Actions Using Probabilistic Automata

no code implementations • 28 Apr 2020 • Rodrigo Santa Cruz, Anoop Cherian, Basura Fernando, Dylan Campbell, Stephen Gould

This paper presents a framework to recognize temporal compositions of atomic actions in videos.

Action Recognition

Forecasting future action sequences with attention: a new approach to weakly supervised action forecasting

no code implementations • 10 Dec 2019 • Yan Bin Ng, Basura Fernando

We extend our action sequence forecasting model to perform weakly supervised action forecasting on two challenging datasets, the Breakfast and the 50Salads.

Machine Translation

Injecting Prior Knowledge into Image Caption Generation

no code implementations • 22 Nov 2019 • Arushi Goel, Basura Fernando, Thanh-Son Nguyen, Hakan Bilen

Automatically generating natural language descriptions from an image is a challenging problem in artificial intelligence that requires a good understanding of the visual and textual signals and the correlations between them.

Caption Generation Image Captioning

Action Anticipation with RBF Kernelized Feature Mapping RNN

no code implementations • ECCV 2018 • Yuge Shi, Basura Fernando, Richard Hartley

We introduce a novel Recurrent Neural Network-based algorithm for future video feature generation and action anticipation called feature mapping RNN.

Action Anticipation

Human Action Sequence Classification

no code implementations • 7 Oct 2019 • Yan Bin Ng, Basura Fernando

Furthermore, we use our model, which is trained to output action sequences, to solve downstream tasks such as video captioning and action localization.

Action Classification Action Segmentation +5

Deep Multiple Instance Learning with Gaussian Weighting

no code implementations • 25 Sep 2019 • Basura Fernando, Hakan Bilen

The instance representation is shared by both instance classification and weighting streams.

Classification Multiple Instance Learning +1

Weakly Supervised Gaussian Networks for Action Detection

no code implementations • 16 Apr 2019 • Basura Fernando, Cheston Tan Yin Chet, Hakan Bilen

Detecting temporal extents of human actions in videos is a challenging computer vision problem that requires detailed manual supervision including frame-level labels.

Action Detection Temporal Action Localization

VIENA2: A Driving Anticipation Dataset

no code implementations • 22 Oct 2018 • Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Basura Fernando, Lars Petersson, Lars Andersson

Action anticipation is critical in scenarios where one needs to react before the action is finalized.

Action Anticipation

Face Super-resolution Guided by Facial Component Heatmaps

no code implementations • ECCV 2018 • Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley

State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information.

Face Hallucination Hallucination +1

Action Anticipation By Predicting Future Dynamic Images

no code implementations • 1 Aug 2018 • Cristian Rodriguez, Basura Fernando, Hongdong Li

Human action-anticipation methods predict the future action by observing only a small portion of an action in progress.

Action Anticipation Autonomous Driving +1

Super-Resolving Very Low-Resolution Face Images With Supplementary Attributes

no code implementations • CVPR 2018 • Xin Yu, Basura Fernando, Richard Hartley, Fatih Porikli

An LR input contains low-frequency facial components of its HR version, while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details.

Attribute Face Hallucination +2

Neural Algebra of Classifiers

no code implementations • 26 Jan 2018 • Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian, Stephen Gould

In this paper, we build on the compositionality principle and develop an "algebra" to compose classifiers for complex visual concepts.

Discriminatively Learned Hierarchical Rank Pooling Networks

1 code implementation • 30 May 2017 • Basura Fernando, Stephen Gould

First, we present "discriminative rank pooling" in which the shared weights of our video representation and the parameters of the action classifiers are estimated jointly for a given training dataset of labelled vector sequences using a bilevel optimization formulation of the learning problem.

Activity Recognition Bilevel Optimization +1

DeepPermNet: Visual Permutation Learning

no code implementations • CVPR 2017 • Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian, Stephen Gould

Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet, an end-to-end CNN model for this task.

Representation Learning Rolling Shutter Correction

Generalized Rank Pooling for Activity Recognition

no code implementations • CVPR 2017 • Anoop Cherian, Basura Fernando, Mehrtash Harandi, Stephen Gould

Most popular deep models for action recognition split video sequences into short sub-sequences consisting of a few frames; frame-based features are then pooled for recognizing the activity.

Action Recognition Riemannian optimization +1

Encouraging LSTMs to Anticipate Actions Very Early

1 code implementation • ICCV 2017 • Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Basura Fernando, Lars Petersson, Lars Andersson

In contrast to the widely studied problem of recognizing an action given a complete sequence, action anticipation aims to identify the action from only partially available videos.

Action Anticipation Autonomous Navigation

Deep Learning for Automated Quality Assessment of Color Fundus Images in Diabetic Retinopathy Screening

no code implementations • 7 Mar 2017 • Sajib Kumar Saha, Basura Fernando, Jorge Cuadros, Di Xiao, Yogesan Kanagasingam

Three retinal image analysis experts were employed to categorize these images into Accept and Reject classes based on the precise definition of image quality in the context of DR. A deep learning framework was trained using 3428 images.

Image Quality Assessment

Unsupervised Human Action Detection by Action Matching

no code implementations • 2 Dec 2016 • Basura Fernando, Sareh Shirazi, Stephen Gould

On the MPII Cooking dataset, we detect action segments with a precision of 21.6% and recall of 11.7% over 946 long video pairs and over 5000 ground truth action segments.

Action Detection Activity Recognition

Guided Open Vocabulary Image Captioning with Constrained Beam Search

1 code implementation • EMNLP 2017 • Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould

Existing image captioning models do not generalize well to out-of-domain images containing novel scenes or objects.

Image Captioning TAG +1

Action Recognition with Dynamic Image Networks

3 code implementations • 2 Dec 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi

This is a powerful idea because it allows any video to be converted to an image, so that existing CNN models pre-trained for the analysis of still images can be immediately extended to videos.

Action Recognition Optical Flow Estimation +1

Self-Supervised Video Representation Learning With Odd-One-Out Networks

no code implementations • CVPR 2017 • Basura Fernando, Hakan Bilen, Efstratios Gavves, Stephen Gould

On action classification, our method obtains 60.3% on the UCF101 dataset using only UCF101 data for training, which is approximately 10% better than current state-of-the-art self-supervised learning methods.

Ranked #47 on Self-Supervised Action Recognition on UCF101

Action Classification General Classification +5

Deep Action- and Context-Aware Sequence Learning for Activity Recognition and Anticipation

no code implementations • 17 Nov 2016 • Mohammad Sadegh Aliakbarian, Fatemehsadat Saleh, Basura Fernando, Mathieu Salzmann, Lars Petersson, Lars Andersson

We outperform the state-of-the-art methods that, like ours, rely only on RGB frames as input for both action recognition and anticipation.

Action Recognition Temporal Action Localization

Generalized BackPropagation, Étude De Cas: Orthogonality

no code implementations • 17 Nov 2016 • Mehrtash Harandi, Basura Fernando

This paper introduces an extension of the backpropagation algorithm that enables us to have layers with constrained weights in a deep network.

Dimensionality Reduction Fine-Grained Image Classification +1

SPICE: Semantic Propositional Image Caption Evaluation

11 code implementations • 29 Jul 2016 • Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould

There is considerable interest in the task of automatically generating image captions.

Image Captioning

On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization

no code implementations • 19 Jul 2016 • Stephen Gould, Basura Fernando, Anoop Cherian, Peter Anderson, Rodrigo Santa Cruz, Edison Guo

Some recent works in machine learning and computer vision involve the solution of a bi-level optimization problem.

BIG-bench Machine Learning

Discriminative Hierarchical Rank Pooling for Activity Recognition

no code implementations • CVPR 2016 • Basura Fernando, Peter Anderson, Marcus Hutter, Stephen Gould

We present hierarchical rank pooling, a video sequence encoding method for activity recognition.

Action Recognition Temporal Action Localization

Dynamic Image Networks for Action Recognition

1 code implementation • CVPR 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi, Stephen Gould

We introduce the concept of dynamic image, a novel compact representation of videos useful for video analysis especially when convolutional neural networks (CNNs) are used.

Ranked #62 on Action Recognition on HMDB-51

Action Recognition Temporal Action Localization

Rank Pooling for Action Recognition

1 code implementation • 6 Dec 2015 • Basura Fernando, Efstratios Gavves, Jose Oramas, Amir Ghodrati, Tinne Tuytelaars

We show how the parameters of a function that has been fit to the video data can serve as a robust new video representation.

Action Recognition Gesture Recognition +2

Guiding the Long-Short Term Memory Model for Image Caption Generation

no code implementations • ICCV 2015 • Xu Jia, Efstratios Gavves, Basura Fernando, Tinne Tuytelaars

In this work we focus on the problem of image caption generation.

Caption Generation

Learning to Rank Based on Subsequences

no code implementations • ICCV 2015 • Basura Fernando, Efstratios Gavves, Damien Muselet, Tinne Tuytelaars

We present a supervised learning to rank algorithm that effectively orders images by exploiting the structure in image sequences.

Learning-To-Rank

MidRank: Learning to rank based on subsequences

no code implementations • 29 Nov 2015 • Basura Fernando, Efstratios Gavves, Damien Muselet, Tinne Tuytelaars

We present a supervised learning to rank algorithm that effectively orders images by exploiting the structure in image sequences.

Learning-To-Rank

Guiding Long-Short Term Memory for Image Caption Generation

1 code implementation • 16 Sep 2015 • Xu Jia, Efstratios Gavves, Basura Fernando, Tinne Tuytelaars

In this work we focus on the problem of image caption generation.

Caption Generation

Dataset Fingerprints: Exploring Image Collections Through Data Mining

no code implementations • CVPR 2015 • Konstantinos Rematas, Basura Fernando, Frank Dellaert, Tinne Tuytelaars

As the amount of visual data increases, so does the need for summarization tools that can be used to explore large image collections and to quickly get familiar with their content.

Modeling Video Evolution for Action Recognition

no code implementations • CVPR 2015 • Basura Fernando, Efstratios Gavves, Jose Oramas M., Amir Ghodrati, Tinne Tuytelaars

We postulate that a function capable of ordering the frames of a video temporally (based on the appearance) captures well the evolution of the appearance within the video.

Action Recognition Skeleton Based Action Recognition +1

Object Class Detection and Classification using Multi Scale Gradient and Corner Point based Shape Descriptors

no code implementations • 3 May 2015 • Basura Fernando, Sezer Karaoglu, Sajib Kumar Saha

This paper presents novel multi-scale gradient and corner-point-based shape descriptors.

Classification General Classification +4

Joint cross-domain classification and subspace learning for unsupervised adaptation

no code implementations • 17 Nov 2014 • Basura Fernando, Tatiana Tommasi, Tinne Tuytelaars

Domain adaptation aims at adapting the knowledge acquired on a source domain to a new different but related target domain.

Domain Adaptation domain classification +1

Location Recognition Over Large Time Lags

no code implementations • 26 Sep 2014 • Basura Fernando, Tatiana Tommasi, Tinne Tuytelaars

Would it be possible to automatically associate ancient pictures to modern ones and create fancy cultural heritage city maps?

Domain Adaptation

Subspace Alignment For Domain Adaptation

no code implementations • 18 Sep 2014 • Basura Fernando, Amaury Habrard, Marc Sebban, Tinne Tuytelaars

We present two approaches to determine the only hyper-parameter in our method corresponding to the size of the subspaces.

Domain Adaptation
