Search Results for author: Srijan Das

Found 26 papers, 14 papers with code

BAMM: Bidirectional Autoregressive Motion Model

1 code implementation • 28 Mar 2024 • Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen

To address these challenges, we propose Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework.

Denoising

Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

no code implementations • 19 Jan 2024 • Arijit Das, Somashree Nandy, Rupam Saha, Srijan Das, Diganta Saha

In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram.

Hate Speech Detection

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

1 code implementation • 22 Dec 2023 • Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna

Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides.

Multiple Instance Learning

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

no code implementations • 7 Dec 2023 • Aritra Dutta, Srijan Das, Jacob Nielsen, Rajatsubhra Chakraborty, Mubarak Shah

Despite the commercial abundance of UAVs, aerial data acquisition remains challenging, and the existing Asia and North America-centric open-source UAV datasets are small-scale or low-resolution and lack diversity in scene contextuality.

Benchmarking object-detection +2

Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living

1 code implementation • 30 Nov 2023 • Dominick Reilly, Srijan Das

To facilitate the adoption of video transformers for ADL, we hypothesize that the augmentation of RGB with human pose information, known for its sensitivity to fine-grained motion and multiple viewpoints, is essential.

Ranked #1 on Action Classification on Toyota Smarthome dataset (using extra training data)

Action Classification Action Recognition

Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders

1 code implementation • 31 Oct 2023 • Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, Michael S. Ryoo

We explore the appropriate SSL tasks that can be optimized alongside the primary task, the training schemes for these tasks, and the data scale at which they can be most effective.

DeepFake Detection Face Swapping +1

Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning

no code implementations • 25 Aug 2023 • Pranav Balaji, Abhijit Das, Srijan Das, Antitza Dantcheva

This work explores various multi-task learning (MTL) techniques for classifying videos as original or manipulated in a cross-manipulation scenario, with the aim of improving generalizability in deepfake detection.

Multi-Task Learning

Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers

1 code implementation • 15 Jun 2023 • Dominick Reilly, Aman Chadha, Srijan Das

Both PAAT and PAAB surpass their respective backbone Transformers by up to 9.8% in real-world action recognition and 21.8% in multi-view robotic video alignment.

Action Classification Action Recognition +4

Video + CLIP Baseline for Ego4D Long-term Action Anticipation

1 code implementation • 1 Jul 2022 • Srijan Das, Michael S. Ryoo

The CLIP embedding provides a fine-grained understanding of the objects relevant to an action, whereas the SlowFast network is responsible for modeling temporal information within a video clip of a few frames.

Action Anticipation Long Term Action Anticipation

Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space

1 code implementation • 23 Jun 2022 • Jinghuan Shang, Srijan Das, Michael S. Ryoo

To this end, we propose a 3D Token Representation Layer (3DTRL) that estimates the 3D positional information of the visual tokens and leverages it for learning viewpoint-agnostic representations.

Action Recognition Image Classification +1

CD-Net: Histopathology Representation Learning using Pyramidal Context-Detail Network

1 code implementation • 28 Mar 2022 • Saarthak Kapse, Srijan Das, Prateek Prasanna

To jointly leverage complementary information from multiple resolutions, we present a novel transformer-based Pyramidal Context-Detail Network (CD-Net).

Representation Learning

ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints

no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo

Learning self-supervised video representation predominantly focuses on discriminating instances generated from simple data augmentation schemes.

Data Augmentation

Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning

no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo

To this end, we propose Cross-Modal Manifold Cutmix (CMMC) that inserts a video tesseract into another video tesseract in the feature space across two different modalities.

Action Recognition Representation Learning +3

CTRN: Class-Temporal Relational Network for Action Detection

no code implementations • 26 Oct 2021 • Rui Dai, Srijan Das, Francois Bremond

Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos.

Action Detection

Vi-Mix for Self-Supervised Video Representation

no code implementations • 29 Sep 2021 • Srijan Das, Michael S. Ryoo

We find that our video mixing strategy, Vi-Mix, i.e., preliminary mixing of videos followed by CMMC across different modalities in a video, improves the quality of learned video representations.

Action Recognition Representation Learning +3

Weakly-supervised Joint Anomaly Detection and Classification

no code implementations • 20 Aug 2021 • Snehashis Majhi, Srijan Das, Francois Bremond, Ratnakar Dash, Pankaj Kumar Sa

For a fully automated surveillance system capable of both detecting and classifying anomalies that need immediate action, a joint anomaly detection and classification method is a pressing need.

Anomaly Detection Classification +1

Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection

no code implementations • ICCV 2021 • Rui Dai, Srijan Das, Francois Bremond

On the other hand, sequence-level distillation encourages the student to learn the temporal knowledge from the teacher, which consists of transferring the Global Contextual Relations and the Action Boundary Saliency.

Action Detection Knowledge Distillation +1

VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living

1 code implementation • 17 May 2021 • Srijan Das, Rui Dai, Di Yang, Francois Bremond

But the cost of computing 3D poses from an RGB stream is high in the absence of appropriate sensors.

Ranked #9 on Action Recognition on NTU RGB+D 120 (using extra training data)

Action Classification Action Recognition +1

Toyota Smarthome: Real-World Activities of Daily Living

no code implementations • ICCV 2019 • Srijan Das, Rui Dai, Michal Koperski, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca

As recent activity recognition approaches fail to address the challenges posed by Toyota Smarthome, we present a novel activity recognition method with an attention mechanism.

Ranked #7 on Action Classification on Toyota Smarthome dataset (using extra training data)

16k Action Classification +1

Deep-Temporal LSTM for Daily Living Action Recognition

no code implementations • 1 Feb 2018 • Srijan Das, Michal Koperski, Francois Bremond, Gianpiero Francesca

In this paper, we propose to improve the traditional use of RNNs by employing a many-to-many model for video classification.

Action Recognition General Classification +3
