Search Results for author: Tushar Nagarajan

Found 17 papers, 8 papers with code

Video ReCap: Recursive Captioning of Hour-Long Videos

no code implementations • 20 Feb 2024 • Md Mohaiminul Islam, Ngan Ho, Xitong Yang, Tushar Nagarajan, Lorenzo Torresani, Gedas Bertasius

We utilize a curriculum learning training scheme to learn the hierarchical structure of videos, starting from clip-level captions describing atomic actions, then focusing on segment-level descriptions, and concluding with generating summaries for hour-long videos.

Video Captioning • Video Understanding

Detours for Navigating Instructional Videos

no code implementations • 3 Jan 2024 • Kumar Ashutosh, Zihui Xue, Tushar Nagarajan, Kristen Grauman

We introduce the video detours problem for navigating instructional videos.

16k • Question Answering • +2

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

no code implementations • 30 Nov 2023 • Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

no code implementations • 27 Sep 2023 • Seungwhan Moon, Andrea Madotto, Zhaojiang Lin, Tushar Nagarajan, Matt Smith, Shashank Jain, Chun-Fu Yeh, Prakash Murugesan, Peyman Heidari, Yue Liu, Kavya Srinet, Babak Damavandi, Anuj Kumar

We present Any-Modality Augmented Language Model (AnyMAL), a unified model that reasons over diverse input modality signals (i.e., text, image, video, audio, IMU motion sensor), and generates textual responses.

Ranked #7 on Video Question Answering on STAR Benchmark

Language Modelling • Video Question Answering

Shaping embodied agent behavior with activity-context priors from egocentric video

no code implementations • NeurIPS 2021 • Tushar Nagarajan, Kristen Grauman

For a given object, an activity-context prior represents the set of other compatible objects that are required for activities to succeed (e.g., a knife and cutting board brought together with a tomato are conducive to cutting).

Ego4D: Around the World in 3,000 Hours of Egocentric Video

6 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification • Ethics

Environment Predictive Coding for Visual Navigation

no code implementations • ICLR 2022 • Santhosh Kumar Ramakrishnan, Tushar Nagarajan, Ziad Al-Halah, Kristen Grauman

We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents.

Representation Learning • Self-Supervised Learning • +1

Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos

1 code implementation • CVPR 2021 • Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman

We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets.

Egocentric Activity Recognition • Knowledge Distillation

Environment Predictive Coding for Embodied Agents

no code implementations • 3 Feb 2021 • Santhosh K. Ramakrishnan, Tushar Nagarajan, Ziad Al-Halah, Kristen Grauman

We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents.

Self-Supervised Learning

Differentiable Causal Discovery Under Unmeasured Confounding

1 code implementation • 14 Oct 2020 • Rohit Bhattacharya, Tushar Nagarajan, Daniel Malinsky, Ilya Shpitser

In this work, we derive differentiable algebraic constraints that fully characterize the space of ancestral ADMGs, as well as more general classes of ADMGs, arid ADMGs and bow-free ADMGs, that capture all equality restrictions on the observed variables.

Causal Discovery

Learning Affordance Landscapes for Interaction Exploration in 3D Environments

1 code implementation • NeurIPS 2020 • Tushar Nagarajan, Kristen Grauman

We introduce a reinforcement learning approach for exploration for interaction, whereby an embodied agent autonomously discovers the affordance landscape of a new unmapped 3D environment (such as an unfamiliar kitchen).

EGO-TOPO: Environment Affordances from Egocentric Video

1 code implementation • CVPR 2020 • Tushar Nagarajan, Yanghao Li, Christoph Feichtenhofer, Kristen Grauman

We introduce a model for environment affordances that is learned directly from egocentric video.

Grounded Human-Object Interaction Hotspots from Video (Extended Abstract)

no code implementations • 3 Jun 2019 • Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman

Learning how to interact with objects is an important step towards embodied visual intelligence, but existing techniques suffer from heavy supervision or sensing requirements.

Human-Object Interaction Detection • Object • +1

Grounded Human-Object Interaction Hotspots from Video

1 code implementation • ICCV 2019 • Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman

Learning how to interact with objects is an important step towards embodied visual intelligence, but existing techniques suffer from heavy supervision or sensing requirements.

Ranked #3 on Video-to-image Affordance Grounding on EPIC-Hotspot

Human-Object Interaction Detection • Object • +3

Attributes as Operators: Factorizing Unseen Attribute-Object Compositions

1 code implementation • ECCV 2018 • Tushar Nagarajan, Kristen Grauman

In addition, we show that not only can our model recognize unseen compositions robustly in an open-world setting, it can also generalize to compositions where objects themselves were unseen during training.

Ranked #5 on Image Retrieval with Multi-Modal Query on MIT-States

Attribute • Compositional Zero-Shot Learning • +2

BlockDrop: Dynamic Inference Paths in Residual Networks

1 code implementation • CVPR 2018 • Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris

Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications.

CANDiS: Coupled & Attention-Driven Neural Distant Supervision

no code implementations • 26 Oct 2017 • Tushar Nagarajan, Sharmistha, Partha Talukdar

The unsupervised nature of this technique allows it to scale to web-scale relation extraction tasks, at the expense of noise in the training data.

Relation • Relation Extraction
