Search Results for author: Rameswar Panda

Found 44 papers, 18 papers with code

Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

no code implementations NeurIPS 2021 Aadarsh Sahoo, Rutav Shah, Rameswar Panda, Kate Saenko, Abir Das

Unsupervised domain adaptation, which aims to adapt models trained on a labeled source domain to a completely unlabeled target domain, has attracted much attention in recent years.

Contrastive Learning Unsupervised Domain Adaptation +1

Selective Regression Under Fairness Criteria

no code implementations 28 Oct 2021 Abhin Shah, Yuheng Bu, Joshua Ka-Wing Lee, Subhro Das, Rameswar Panda, Prasanna Sattigeri, Gregory W. Wornell

Selective regression allows abstention from prediction if the confidence to make an accurate prediction is not sufficient.

Fairness

Dynamic Network Quantization for Efficient Video Inference

1 code implementation ICCV 2021 Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko

Deep convolutional networks have recently achieved great success in video recognition, yet their practical realization remains a challenge due to the large amount of computational resources required to achieve robust recognition.

Frame Quantization +1

Can An Image Classifier Suffice For Action Recognition?

1 code implementation ICLR 2022 Quanfu Fan, Chun-Fu Chen, Rameswar Panda

We explore a new perspective on video understanding by casting the video recognition problem as an image recognition task.

Action Recognition Image Classification +2

IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers

no code implementations NeurIPS 2021 Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva

The self-attention-based model, the transformer, has recently become the leading backbone in the field of computer vision.

Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data

1 code implementation NeurIPS 2021 Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Richard J. Radke

As the base dataset and unlabeled dataset are from different domains, projecting the target images in the class-domain of the base dataset with a fixed pretrained model might be sub-optimal.

cross-domain few-shot learning

RegionViT: Regional-to-Local Attention for Vision Transformers

2 code implementations ICLR 2022 Chun-Fu Chen, Rameswar Panda, Quanfu Fan

The regional-to-local attention includes two steps: first, the regional self-attention extracts global information among all regional tokens, and then the local self-attention exchanges information between one regional token and its associated local tokens.

Action Recognition Image Classification +2

AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition

1 code implementation ICCV 2021 Rameswar Panda, Chun-Fu Chen, Quanfu Fan, Ximeng Sun, Kate Saenko, Aude Oliva, Rogerio Feris

Specifically, given a video segment, a multi-modal policy network is used to decide what modalities should be used for processing by the recognition model, with the goal of improving both accuracy and efficiency.

Video Recognition

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

1 code implementation ICCV 2021 Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang

Multimodal self-supervised learning is attracting growing attention, as it makes it possible not only to train large networks without human supervision but also to search and retrieve data across various modalities.

Contrastive Learning Self-Supervised Learning +3

CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

6 code implementations ICCV 2021 Chun-Fu Chen, Quanfu Fan, Rameswar Panda

To this end, we propose a dual-branch transformer to combine image patches (i.e., tokens in a transformer) of different sizes to produce stronger image features.

Classification General Classification +1

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths

no code implementations 2 Mar 2021 Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Naigang Wang, Bowen Pan, Kailash Gopalakrishnan, Aude Oliva, Rogerio Feris, Kate Saenko

Second, to effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.

Image Classification Quantization +2

VA-RED$^2$: Video Adaptive Redundancy Reduction

no code implementations ICLR 2021 Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris

An inherent property of real-world videos is the high correlation of information across frames, which can translate into redundancy in the temporal feature maps of the models, the spatial feature maps, or both.

Semi-Supervised Action Recognition with Temporal Contrastive Learning

1 code implementation CVPR 2021 Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

We approach this problem by learning a two-pathway temporal contrastive model using unlabeled videos at two different speeds, leveraging the fact that changing video speed does not change an action.

Action Recognition Contrastive Learning

A Maximal Correlation Approach to Imposing Fairness in Machine Learning

no code implementations 30 Dec 2020 Joshua Lee, Yuheng Bu, Prasanna Sattigeri, Rameswar Panda, Gregory Wornell, Leonid Karlinsky, Rogerio Feris

As machine learning algorithms grow in popularity and diversify to many industries, ethical and legal concerns regarding their fairness have become increasingly relevant.

Fairness

Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation

no code implementations 6 Dec 2020 Aadarsh Sahoo, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

Partial domain adaptation, which assumes that the unknown target label space is a subset of the source label space, has attracted much attention in computer vision.

Partial Domain Adaptation

Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition

1 code implementation CVPR 2021 Chun-Fu Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan

In recent years, a number of approaches based on 2D or 3D convolutional neural networks (CNN) have emerged for video action recognition, achieving state-of-the-art results on several large-scale benchmark datasets.

Action Recognition

Measurement-driven Security Analysis of Imperceptible Impersonation Attacks

no code implementations 26 Aug 2020 Shasha Li, Karim Khalil, Rameswar Panda, Chengyu Song, Srikanth V. Krishnamurthy, Amit K. Roy-Chowdhury, Ananthram Swami

The emergence of the Internet of Things (IoT) brings about new security challenges at the intersection of cyber and physical spaces.

Face Recognition

Adversarial Knowledge Transfer from Unlabeled Data

1 code implementation 13 Aug 2020 Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury

While machine learning approaches to visual recognition offer great promise, most of the existing methods rely heavily on the availability of large quantities of labeled training data.

Transfer Learning

Mitigating Dataset Imbalance via Joint Generation and Classification

1 code implementation 12 Aug 2020 Aadarsh Sahoo, Ankit Singh, Rameswar Panda, Rogerio Feris, Abir Das

In this work, we address these questions from the perspective of dataset imbalance resulting from severe under-representation of annotated training data for certain classes, and its effect on both deep classification and generation methods.

Classification General Classification

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition

1 code implementation ECCV 2020 Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris

Specifically, given a video frame, a policy network is used to decide what input resolution should be used for processing by the action recognition model, with the goal of improving both accuracy and efficiency.

Action Recognition Frame

NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search

no code implementations 23 Jun 2020 Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee

The typical way of conducting large scale NAS is to search for an architectural building block on a small dataset (either using a proxy set from the large dataset or a completely different small scale dataset) and then transfer the block to a larger dataset.

Neural Architecture Search

Non-Adversarial Video Synthesis with Learned Priors

1 code implementation CVPR 2020 Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman Asif, Amit K. Roy-Chowdhury

Different from these methods, we focus on the problem of generating videos from latent noise vectors, without any reference input frames.

Frame

Estimating Skin Tone and Effects on Classification Performance in Dermatology Datasets

no code implementations 29 Oct 2019 Newton M. Kinyanjui, Timothy Odonga, Celia Cintas, Noel C. F. Codella, Rameswar Panda, Prasanna Sattigeri, Kush R. Varshney

We find that the majority of the data in the two datasets have ITA values between 34.5° and 48°, which are associated with lighter skin; this is consistent with under-representation of darker-skinned populations in these datasets.

General Classification Skin Cancer Classification

Exploiting Global Camera Network Constraints for Unsupervised Video Person Re-identification

no code implementations 27 Aug 2019 Xueping Wang, Rameswar Panda, Min Liu, Yaonan Wang, Amit K. Roy-Chowdhury

Additionally, a cross-view matching strategy followed by global camera network constraints is proposed to explore the matching relationships across the entire camera network.

Graph Matching Metric Learning +2

Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval

no code implementations Proceedings of the 26th ACM International Conference on Multimedia, October 2018 Niluthpol Chowdhury Mithun, Rameswar Panda, Vagelis Papalexakis, Amit K. Roy-Chowdhury

Inspired by the recent success of web-supervised learning in deep neural networks, we capitalize on readily-available web images with noisy annotations to learn robust image-text joint representation.

Cross-Modal Retrieval

Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval

no code implementations 23 Aug 2018 Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos E. Papalexakis, Amit K. Roy-Chowdhury

Inspired by the recent success of webly supervised learning in deep neural networks, we capitalize on readily-available web images with noisy annotations to learn robust image-text joint representation.

Cross-Modal Retrieval

Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias

no code implementations ECCV 2018 Rameswar Panda, Jianming Zhang, Haoxiang Li, Joon-Young Lee, Xin Lu, Amit K. Roy-Chowdhury

While machine learning approaches to visual emotion recognition offer great promise, current methods consider training and testing models on small scale datasets covering limited visual emotion concepts.

Emotion Recognition

FFNet: Video Fast-Forwarding via Reinforcement Learning

1 code implementation CVPR 2018 Shuyue Lan, Rameswar Panda, Qi Zhu, Amit K. Roy-Chowdhury

The first group is supported by video summarization techniques, which require processing of the entire video to select an important subset for showing to users.

reinforcement-learning Video Summarization

Weakly Supervised Summarization of Web Videos

no code implementations ICCV 2017 Rameswar Panda, Abir Das, Ziyan Wu, Jan Ernst, Amit K. Roy-Chowdhury

Casting the problem as a weakly supervised learning problem, we propose a flexible deep 3D CNN architecture to learn the notion of importance using only video-level annotation, and without any human-crafted training data.

Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks

no code implementations CVPR 2017 Rameswar Panda, Amran Bhuiyan, Vittorio Murino, Amit K. Roy-Chowdhury

Most approaches have neglected the dynamic and open world nature of the re-identification problem, where a new camera may be temporarily inserted into an existing system to get additional information.

Person Re-Identification

Collaborative Summarization of Topic-Related Videos

no code implementations CVPR 2017 Rameswar Panda, Amit K. Roy-Chowdhury

Large collections of videos are grouped into clusters by a topic keyword, such as Eiffel Tower or Surfing, with many important visual concepts repeating across them.

Information Retrieval

Multi-View Surveillance Video Summarization via Joint Embedding and Sparse Optimization

no code implementations 9 Jun 2017 Rameswar Panda, Amit K. Roy-Chowdhury

In this paper, with the aim of summarizing multi-view videos, we introduce a novel unsupervised framework via joint embedding and sparse representative selection.

Video Summarization

Diversity-aware Multi-Video Summarization

no code implementations 9 Jun 2017 Rameswar Panda, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury

Most video summarization approaches have focused on extracting a summary from a single video; we propose an unsupervised framework for summarizing a collection of videos.

Video Summarization

Video Summarization in a Multi-View Camera Network

no code implementations 1 Aug 2016 Rameswar Panda, Abir Das, Amit K. Roy-Chowdhury

While most existing video summarization approaches aim to extract an informative summary of a single video, we propose a novel framework for summarizing multi-view videos by exploiting both intra- and inter-view content correlations in a joint embedding space.

Video Summarization

Continuous Adaptation of Multi-Camera Person Identification Models through Sparse Non-redundant Representative Selection

no code implementations 1 Jul 2016 Abir Das, Rameswar Panda, Amit K. Roy-Chowdhury

We demonstrate the effectiveness of our approach on multi-camera person re-identification datasets, showing the feasibility of learning online classification models in multi-camera big data applications.

Person Identification Person Re-Identification
