1 code implementation • 14 Apr 2024 • Jose M. Rojas Chaves, Subarna Tripathi
We propose a graph-based representation learning framework for video summarization.
1 code implementation • 6 Dec 2023 • Ivan Rodin, Antonino Furnari, Kyle Min, Subarna Tripathi, Giovanni Maria Farinella
We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos.
no code implementations • 9 Jun 2023 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Research in scene graph generation (SGG) usually considers two-stage models, that is, detecting a set of entities, followed by combining them and labeling all possible relationships.
1 code implementation • CVPR 2023 • Yi Li, Kyle Min, Subarna Tripathi, Nuno Vasconcelos
Do video-text transformers learn to model temporal relationships across frames?
Ranked #4 on Video Question Answering on AGQA 2.0 balanced (Average Accuracy metric)
1 code implementation • CVPR 2023 • Sayak Nag, Kyle Min, Subarna Tripathi, Amit K. Roy Chowdhury
The task of dynamic scene graph generation (SGG) from videos is complicated and challenging due to the inherent dynamics of a scene, temporal fluctuation of model predictions, and the long-tailed distribution of the visual relationships in addition to the already existing challenges in image-based SGG.
2 code implementations • 15 Jul 2022 • Kyle Min, Sourya Roy, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar
Active speaker detection (ASD) in videos with multiple speakers is a challenging task as it requires learning effective audiovisual features and spatial-temporal correlations over long temporal windows.
Ranked #1 on Node Classification on AVA
1 code implementation • CVPR 2022 • Xiang Zhang, Yongwen Su, Subarna Tripathi, Zhuowen Tu
In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild.
Ranked #6 on Text Spotting on ICDAR 2015
1 code implementation • CVPR 2022 • Shaowei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang
To tackle this task, we first provide an automatic way to collect trajectory and hotspot labels on large-scale data.
1 code implementation • 18 Dec 2021 • Shengyu Feng, Subarna Tripathi, Hesham Mostafa, Marcel Nassar, Somdeb Majumdar
Dynamic scene graph generation from a video is challenging due to the temporal dynamics of the scene and the inherent temporal fluctuations of predictions.
no code implementations • 2 Dec 2021 • Sourya Roy, Kyle Min, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar
We address the problem of active speaker detection through a new framework, called SPELL, that learns long-range multimodal graphs to encode the inter-modal relationship between audio and visual data.
no code implementations • 4 Nov 2021 • Sainan Liu, Vincent Nguyen, Yuan Gao, Subarna Tripathi, Zhuowen Tu
Our proposed panoptic 3D parsing framework points to a promising direction in computer vision.
no code implementations • ICCV 2021 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Significant effort has been recently devoted to modeling visual relations.
1 code implementation • ICCV 2021 • Kien Nguyen, Subarna Tripathi, Bang Du, Tanaya Guha, Truong Q. Nguyen
Several studies have noted that the naive use of scene graphs from a black-box scene graph generator harms image captioning performance and that scene graph-based captioning models have to incur the overhead of explicit use of image features to generate decent captions.
no code implementations • 13 May 2020 • Brigit Schroeder, Subarna Tripathi
A structured query can capture the complexity of object interactions (e.g. 'woman rides motorcycle') unlike single objects (e.g. 'woman' or 'motorcycle').
no code implementations • 19 Sep 2019 • Brigit Schroeder, Subarna Tripathi, Hanlin Tang
We see a significant performance increase in both metrics that measure the goodness of layout prediction, mean intersection-over-union (mIoU) (52.3% vs. 49.2%) and relation score (61.7% vs. 54.1%), after the addition of triplet supervision and data augmentation.
no code implementations • 19 Apr 2019 • Subarna Tripathi, Sharath Nittur Sridhar, Sairam Sundaresan, Hanlin Tang
Structured representations such as scene graphs serve as an efficient and compact representation that can be used for downstream rendering or retrieval tasks.
no code implementations • ICLR Workshop LLD 2019 • Subarna Tripathi, Anahita Bhiwandiwalla, Alexei Bastidas, Hanlin Tang
Existing scene graph to image models have two stages: (1) a scene composition stage and (2) an image generation stage.
no code implementations • 23 Jan 2019 • Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen
The proposed method is a promising baseline method for joint image generation and compression using generative adversarial networks.
no code implementations • 11 Jan 2019 • Subarna Tripathi, Anahita Bhiwandiwalla, Alexei Bastidas, Hanlin Tang
Generating realistic images from scene graphs asks neural networks to be able to reason about object relationships and compositionality.
5 code implementations • CVPR 2019 • Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information.
Ranked #3 on 3D Instance Segmentation on PartNet
no code implementations • 12 Mar 2018 • Subarna Tripathi, Zachary C. Lipton, Truong Q. Nguyen
In this paper, we propose to denoise corrupted images by finding the nearest point on the GAN manifold, recovering latent vectors by minimizing distances in image space.
no code implementations • 16 May 2017 • Subarna Tripathi, Gokce Dane, Byeongkeun Kang, Vasudev Bhaskaran, Truong Nguyen
Thus the consolidation of a CNN-based object detection for an embedded system is more challenging.
no code implementations • 4 Apr 2017 • Subarna Tripathi, Maxwell Collins, Matthew Brown, Serge Belongie
In a more realistic environment, without the oracle keypoints, the proposed deep person instance segmentation model conditioned on human pose achieves 3.8% to 10.5% relative improvements compared with its strongest baseline of a deep network trained only for segmentation.
1 code implementation • 15 Feb 2017 • Zachary C. Lipton, Subarna Tripathi
Generative adversarial networks (GANs) transform latent vectors into visually plausible images.
no code implementations • 20 Dec 2016 • Subarna Tripathi, Brian Guenter
This eliminates the need for an explicit calibration step and automatically compensates for small movements of the headset with respect to the head.
no code implementations • 15 Jul 2016 • Subarna Tripathi, Zachary C. Lipton, Serge Belongie, Truong Nguyen
Then we train a recurrent neural network that takes as input sequences of pseudo-labeled frames and optimizes an objective that encourages both accuracy on the target frame and consistency across consecutive frames.
no code implementations • 20 Jan 2016 • Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen
We further propose a clustering of VOPs which can efficiently be used for detecting objects in video in a streaming fashion.
1 code implementation • 10 Sep 2015 • Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen
We train CNNs for the classification of 31 letters and numbers using a subset of collected depth data from multiple subjects.
1 code implementation • 4 Sep 2015 • Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen
We explore the efficiency of the CRF inference beyond image level semantic segmentation and perform joint inference in video frames.
no code implementations • 1 Jul 2015 • Subarna Tripathi, Serge Belongie, Truong Nguyen
We explore the efficiency of the CRF inference module beyond image level semantic segmentation.
no code implementations • 14 Feb 2014 • Subarna Tripathi, Youngbae Hwang, Serge Belongie, Truong Nguyen
Despite recent advances in video segmentation, many opportunities remain to improve it using a variety of low and mid-level visual cues.