Search Results for author: Srijan Das

Found 26 papers, 14 papers with code

BAMM: Bidirectional Autoregressive Motion Model

1 code implementation • 28 Mar 2024 • Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen

To address these challenges, we propose Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework.

Denoising

Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

no code implementations • 19 Jan 2024 • Arijit Das, Somashree Nandy, Rupam Saha, Srijan Das, Diganta Saha

In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram.

Hate Speech Detection

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

1 code implementation • 22 Dec 2023 • Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna

Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides.

Multiple Instance Learning

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

no code implementations • 7 Dec 2023 • Aritra Dutta, Srijan Das, Jacob Nielsen, Rajatsubhra Chakraborty, Mubarak Shah

Despite the commercial abundance of UAVs, aerial data acquisition remains challenging, and the existing Asia and North America-centric open-source UAV datasets are small-scale or low-resolution and lack diversity in scene contextuality.

Benchmarking object-detection +2

Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living

1 code implementation • 30 Nov 2023 • Dominick Reilly, Srijan Das

To facilitate the adoption of video transformers for ADL, we hypothesize that the augmentation of RGB with human pose information, known for its sensitivity to fine-grained motion and multiple viewpoints, is essential.

Ranked #1 on Action Classification on Toyota Smarthome dataset (using extra training data)

Action Classification Action Recognition

Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders

1 code implementation • 31 Oct 2023 • Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, Michael S. Ryoo

We explore the appropriate SSL tasks that can be optimized alongside the primary task, the training schemes for these tasks, and the data scale at which they can be most effective.

DeepFake Detection Face Swapping +1

Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning

no code implementations • 12 Sep 2023 • Saarthak Kapse, Srijan Das, Jingwei Zhang, Rajarsi R. Gupta, Joel Saltz, Dimitris Samaras, Prateek Prasanna

We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging.

Cell Segmentation Representation Learning +1

Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning

no code implementations • 25 Aug 2023 • Pranav Balaji, Abhijit Das, Srijan Das, Antitza Dantcheva

This work explores various multi-task learning (MTL) techniques for classifying videos as original or manipulated in a cross-manipulation scenario, with the aim of improving generalizability in deep fake detection.

Multi-Task Learning

Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers

1 code implementation • 15 Jun 2023 • Dominick Reilly, Aman Chadha, Srijan Das

Both PAAT and PAAB surpass their respective backbone Transformers by up to 9.8% in real-world action recognition and 21.8% in multi-view robotic video alignment.

Action Classification Action Recognition +4

Video + CLIP Baseline for Ego4D Long-term Action Anticipation

1 code implementation • 1 Jul 2022 • Srijan Das, Michael S. Ryoo

The CLIP embedding provides a fine-grained understanding of the objects relevant to an action, whereas the SlowFast network is responsible for modeling temporal information within a video clip of a few frames.

Action Anticipation Long Term Action Anticipation

Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space

1 code implementation • 23 Jun 2022 • Jinghuan Shang, Srijan Das, Michael S. Ryoo

To this end, we propose a 3D Token Representation Layer (3DTRL) that estimates the 3D positional information of the visual tokens and leverages it for learning viewpoint-agnostic representations.

Action Recognition Image Classification +1

Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels?

2 code implementations • 10 Jun 2022 • Xiang Li, Jinghuan Shang, Srijan Das, Michael S. Ryoo

We investigate whether self-supervised learning (SSL) can improve online reinforcement learning (RL) from pixels.

Image Augmentation reinforcement-learning +2

CD-Net: Histopathology Representation Learning using Pyramidal Context-Detail Network

1 code implementation • 28 Mar 2022 • Saarthak Kapse, Srijan Das, Prateek Prasanna

To jointly leverage complementary information from multiple resolutions, we present a novel transformer based Pyramidal Context-Detail Network (CD-Net).

Representation Learning

MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection

1 code implementation • CVPR 2022 • Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, Francois Bremond

Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos.

Ranked #2 on Action Detection on TSU

Action Detection Temporal Action Localization

ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints

no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo

Learning self-supervised video representation predominantly focuses on discriminating instances generated from simple data augmentation schemes.

Data Augmentation

Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning

no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo

To this end, we propose Cross-Modal Manifold Cutmix (CMMC) that inserts a video tesseract into another video tesseract in the feature space across two different modalities.

Action Recognition Representation Learning +3

CTRN: Class-Temporal Relational Network for Action Detection

no code implementations • 26 Oct 2021 • Rui Dai, Srijan Das, Francois Bremond

Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos.

Ranked #2 on Action Detection on Multi-THUMOS

Action Detection

Vi-Mix for Self-supervised Video Representation

no code implementations • 29 Sep 2021 • Srijan Das, Michael S. Ryoo

We find that our video mixing strategy: Vi-Mix, i.e. preliminary mixing of videos followed by CMMC across different modalities in a video, improves the quality of learned video representations.

Action Recognition Representation Learning +3

Weakly-supervised Joint Anomaly Detection and Classification

no code implementations • 20 Aug 2021 • Snehashis Majhi, Srijan Das, Francois Bremond, Ratnakar Dash, Pankaj Kumar Sa

Thinking of a fully automatized surveillance system, which is capable of both detecting and classifying the anomalies that need immediate actions, a joint anomaly detection and classification method is a pressing need.

Anomaly Detection Classification +1

Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection

no code implementations • ICCV 2021 • Rui Dai, Srijan Das, Francois Bremond

On the other hand, sequence-level distillation encourages the student to learn the temporal knowledge from the teacher, which consists of transferring the Global Contextual Relations and the Action Boundary Saliency.

Action Detection Knowledge Distillation +1

VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living

1 code implementation • 17 May 2021 • Srijan Das, Rui Dai, Di Yang, Francois Bremond

But the cost of computing 3D poses from RGB stream is high in the absence of appropriate sensors.

Ranked #9 on Action Recognition on NTU RGB+D 120 (using extra training data)

Action Classification Action Recognition +1

PDAN: Pyramid Dilated Attention Network for Action Detection

1 code implementation • 5 Jan 2021 • Rui Dai, Srijan Das, Luca Minciullo, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond

Previous action detection methods fail in selecting the key temporal information in long videos.

Ranked #1 on Action Detection on TSU

Action Detection Multi-Label Classification +1

Toyota Smarthome Untrimmed: Real-World Untrimmed Videos for Activity Detection

1 code implementation • 28 Oct 2020 • Rui Dai, Srijan Das, Saurav Sharma, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca

Therefore, we propose a new baseline method for activity detection to tackle the novel challenges provided by our dataset.

Action Detection Activity Detection

VPN: Learning Video-Pose Embedding for Activities of Daily Living

1 code implementation • ECCV 2020 • Srijan Das, Saurav Sharma, Rui Dai, Francois Bremond, Monique Thonnat

The 2 key components of this VPN are a spatial embedding and an attention network.

Ranked #6 on Action Classification on Toyota Smarthome dataset (using extra training data)

Action Classification Action Recognition +2

Toyota Smarthome: Real-World Activities of Daily Living

no code implementations • ICCV 2019 • Srijan Das, Rui Dai, Michal Koperski, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca

As recent activity recognition approaches fail to address the challenges posed by Toyota Smarthome, we present a novel activity recognition method with attention mechanism.

Ranked #7 on Action Classification on Toyota Smarthome dataset (using extra training data)

16k Action Classification +1

Deep-Temporal LSTM for Daily Living Action Recognition

no code implementations • 1 Feb 2018 • Srijan Das, Michal Koperski, Francois Bremond, Gianpiero Francesca

In this paper, we propose to improve the traditional use of RNNs by employing a many to many model for video classification.

Action Recognition General Classification +3
