Search Results for author: Jathushan Rajasegaran

Found 29 papers, 12 papers with code

Perception Encoder: The best visual embeddings are not at the output of the network

1 code implementation · 17 Apr 2025 · Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer

Together with the core contrastive checkpoint, our PE family of models achieves state-of-the-art performance on a wide variety of tasks, including zero-shot image and video classification and retrieval; document, image, and video Q&A; and spatial tasks such as detection, depth estimation, and tracking.

Ranked #1 on Object Detection on COCO minival (using extra training data)

Depth Estimation, Language Modeling, +4

Poly-Autoregressive Prediction for Modeling Interactions

no code implementations · 12 Feb 2025 · Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran, Shiry Ginosar, Jitendra Malik

At its core, PAR represents the behavior of all agents as a sequence of tokens, each representing an agent's state at a specific timestep.

Autonomous Vehicles, Prediction, +1
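The token layout the PAR snippet describes (one token per agent per timestep) can be sketched as below; the two-agent toy trajectory and the discrete state ids are hypothetical placeholders, not the paper's actual tokenization.

```python
# Sketch: flatten a multi-agent trajectory into a token sequence,
# one token per (timestep, agent) pair, timestep-major, so every
# agent's state at step t precedes any state at step t+1.

def tokenize_interaction(trajectory):
    """trajectory: list over timesteps; each item is a list of
    per-agent discrete state ids. Returns a flat token sequence."""
    tokens = []
    for t, agent_states in enumerate(trajectory):
        for agent_id, state in enumerate(agent_states):
            tokens.append((t, agent_id, state))
    return tokens

# Two agents over three timesteps (toy discrete states).
traj = [[5, 9], [6, 9], [6, 8]]
seq = tokenize_interaction(traj)
assert len(seq) == 6            # 3 timesteps x 2 agents
assert seq[0] == (0, 0, 5)      # agent 0 at t=0 comes first
```

An autoregressive model over such a sequence then predicts each agent's next state conditioned on every agent's history.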

Gaussian Masked Autoencoders

no code implementations · 6 Jan 2025 · Jathushan Rajasegaran, Xinlei Chen, Rulilong Li, Christoph Feichtenhofer, Jitendra Malik, Shiry Ginosar

Our approach, named Gaussian Masked Autoencoder, or GMAE, aims to learn semantic abstractions and spatial understanding jointly.

Edge Detection, Representation Learning, +2

Scaling Properties of Diffusion Models for Perceptual Tasks

no code implementations · 12 Nov 2024 · Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran, Jitendra Malik

In this paper, we argue that iterative computation with diffusion models offers a powerful paradigm for not only generation but also visual perception tasks.

Depth Estimation, Image-to-Image Translation, +1

Synergy and Synchrony in Couple Dances

no code implementations · 6 Sep 2024 · Vongani Maluleke, Lea Müller, Jathushan Rajasegaran, Georgios Pavlakos, Shiry Ginosar, Angjoo Kanazawa, Jitendra Malik

Our contributions are a demonstration of the advantages of socially conditioned future motion prediction and an in-the-wild couple dance video dataset to enable future research in this direction.

Motion Prediction, Prediction

FewShotNeRF: Meta-Learning-based Novel View Synthesis for Rapid Scene-Specific Adaptation

no code implementations · 9 Aug 2024 · Piraveen Sivakumar, Paul Janson, Jathushan Rajasegaran, Thanuja Ambegoda

In this paper, we address the challenge of generating novel views of real-world objects with limited multi-view images through our proposed approach, FewShotNeRF.

Meta-Learning, NeRF, +1

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

no code implementations · 15 Apr 2024 · Amir Bar, Arya Bakhtiar, Danny Tran, Antonio Loquercio, Jathushan Rajasegaran, Yann LeCun, Amir Globerson, Trevor Darrell

Animals perceive the world to plan their actions and interact with other agents to accomplish complex tasks, demonstrating capabilities that are still unmatched by AI systems.

Humanoid Locomotion as Next Token Prediction

no code implementations · 29 Feb 2024 · Ilija Radosavovic, Bike Zhang, Baifeng Shi, Jathushan Rajasegaran, Sarthak Kamat, Trevor Darrell, Koushil Sreenath, Jitendra Malik

We cast real-world humanoid control as a next token prediction problem, akin to predicting the next word in language.

Humanoid Control, Prediction
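Casting control as next-token prediction, as the snippet above describes, amounts to an autoregressive loop: feed the token history to a predictor and append its output, exactly as in next-word prediction. The trivial "model" below is a stand-in assumption (the actual work trains a transformer over sensorimotor tokens).

```python
# Sketch of an autoregressive control rollout with a placeholder
# predictor. `toy_model` maps a token history to the next token id.

def toy_model(history):
    # Stand-in policy: cycle through a 4-symbol vocabulary.
    return (history[-1] + 1) % 4

def rollout(model, prompt, steps):
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(model(tokens))  # one next-token prediction step
    return tokens

out = rollout(toy_model, [0], 5)
assert out == [0, 1, 2, 3, 0, 1]
```

In a real system the appended tokens would encode observations and actions, and the action tokens would be decoded and executed on the robot at each step.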

Synthesizing Moving People with 3D Control

no code implementations · 19 Jan 2024 · Boyi Li, Junming Chen, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik

This disentangled approach allows our method to generate a sequence of images that are faithful to the target motion in the 3D pose and to the input image in terms of visual similarity.

Fully Online Meta-Learning Without Task Boundaries

no code implementations · 1 Feb 2022 · Jathushan Rajasegaran, Chelsea Finn, Sergey Levine

In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future.

Meta-Learning

Tracking People by Predicting 3D Appearance, Location and Pose

no code implementations · CVPR 2022 · Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.
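The "probabilistic" similarity mentioned above is not specified in this snippet; one common instantiation, shown here purely as an assumption, scores a single-frame observation under a Gaussian centered on the tracklet's predicted state (e.g. its predicted location).

```python
import math

# Assumed sketch: diagonal-Gaussian log-likelihood of an observed
# state vector given a tracklet's predicted state vector.
def gaussian_log_likelihood(pred, obs, sigma=1.0):
    return sum(
        -0.5 * ((o - p) / sigma) ** 2
        - math.log(sigma * math.sqrt(2.0 * math.pi))
        for p, o in zip(pred, obs)
    )

# An observation matching the prediction scores higher than a distant one,
# so association can pick the most likely detection per tracklet.
near = gaussian_log_likelihood([10.0, 20.0], [10.0, 20.0])
far = gaussian_log_likelihood([10.0, 20.0], [15.0, 28.0])
assert near > far
```

The same scoring idea extends to appearance and pose features by concatenating them into the state vector, with per-dimension variances.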

Tracking People by Predicting 3D Appearance, Location & Pose

no code implementations · 8 Dec 2021 · Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.

Tracking People with 3D Representations

1 code implementation · NeurIPS 2021 · Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.

3D geometry

Mitigating Mode Collapse by Sidestepping Catastrophic Forgetting

no code implementations · 1 Jan 2021 · Karttikeya Mangalam, Rohin Garg, Jathushan Rajasegaran, Taesung Park

Generative Adversarial Networks (GANs) are a class of generative models used for various applications, but they have been known to suffer from the mode collapse problem, in which some modes of the target distribution are ignored by the generator.

Continual Learning

Meta-learning the Learning Trends Shared Across Tasks

no code implementations · 19 Oct 2020 · Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

This demonstrates their ability to acquire transferable knowledge, a capability that is central to human learning.

Meta-Learning

Self-supervised Knowledge Distillation for Few-shot Learning

2 code implementations · 17 Jun 2020 · Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

Our experiments show that, even in the first stage, self-supervision can outperform current state-of-the-art methods, with further gains achieved by our second stage distillation process.

Few-Shot Image Classification, Few-Shot Learning, +2

A Multi-modal Neural Embeddings Approach for Detecting Mobile Counterfeit Apps: A Case Study on Google Play Store

no code implementations · 2 Jun 2020 · Naveen Karunanayake, Jathushan Rajasegaran, Ashanie Gunathillake, Suranga Seneviratne, Guillaume Jourjon

We show that a novel approach of combining content embeddings and style embeddings outperforms the baseline methods for image similarity such as SIFT, SURF, and various image hashing methods.
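Combining content and style embeddings, as described above, can be as simple as concatenating the two vectors per app icon and comparing apps with cosine similarity; the tiny toy vectors below are placeholders (the paper's embeddings come from neural encoders).

```python
import math

# Sketch: concatenate content and style embeddings, then compare
# two apps with cosine similarity over the combined vector.
def combined_embedding(content_vec, style_vec):
    return content_vec + style_vec  # list concatenation

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

app_a = combined_embedding([1.0, 0.0], [0.5, 0.5])
app_b = combined_embedding([1.0, 0.1], [0.5, 0.4])  # near-duplicate icon
app_c = combined_embedding([0.0, 1.0], [0.1, 0.9])  # unrelated icon
assert cosine(app_a, app_b) > cosine(app_a, app_c)
```

Ranking candidate apps by this score surfaces likely counterfeits of a target app for review.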

iTAML: An Incremental Task-Agnostic Meta-learning Approach

1 code implementation · CVPR 2020 · Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

In this paper, we hypothesize that this problem can be avoided by learning a set of generalized parameters that are specific neither to old nor to new tasks.

Incremental Learning, Meta-Learning

Random Path Selection for Continual Learning

1 code implementation · NeurIPS 2019 · Jathushan Rajasegaran, Munawar Hayat, Salman H. Khan, Fahad Shahbaz Khan, Ling Shao

In order to maintain an equilibrium between previous and newly acquired knowledge, we propose a simple controller to dynamically balance the model plasticity.

Continual Learning, Incremental Learning, +2

An Adaptive Random Path Selection Approach for Incremental Learning

1 code implementation · 3 Jun 2019 · Jathushan Rajasegaran, Munawar Hayat, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Ming-Hsuan Yang

In a conventional supervised learning setting, a machine learning model has access to examples of all the object classes it must recognize at inference time.

Incremental Learning, Knowledge Distillation, +1

DeepCaps: Going Deeper with Capsule Networks

5 code implementations · CVPR 2019 · Jathushan Rajasegaran, Vinoj Jayasundara, Sandaru Jayasekara, Hirunima Jayasekara, Suranga Seneviratne, Ranga Rodrigo

Capsule Networks are a promising concept in deep learning, yet their true potential has not been fully realized thus far, with sub-par performance on several key benchmark datasets with complex data.

Decoder

TextCaps: Handwritten Character Recognition with Very Small Datasets

3 code implementations · 17 Apr 2019 · Vinoj Jayasundara, Sandaru Jayasekara, Hirunima Jayasekara, Jathushan Rajasegaran, Suranga Seneviratne, Ranga Rodrigo

Our system is useful for character recognition in localized languages that lack large labeled training sets, and even in related, more general contexts such as object recognition.

Few-Shot Image Classification, Image Generation

A Neural Embeddings Approach for Detecting Mobile Counterfeit Apps

no code implementations · 26 Apr 2018 · Jathushan Rajasegaran, Suranga Seneviratne, Guillaume Jourjon

We show that further performance increases can be achieved by combining style embeddings with content embeddings.
