Search Results for author: Achal Dave

Found 24 papers, 13 papers with code

Linearizing Large Language Models

1 code implementation 10 May 2024 Jean Mercat, Igor Vasiljevic, Sedrick Keh, Kushal Arora, Achal Dave, Adrien Gaidon, Thomas Kollar

Linear transformers have emerged as a subquadratic-time alternative to softmax attention and have garnered significant interest due to their fixed-size recurrent state that lowers inference cost.

In-Context Learning

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

1 code implementation 25 Jan 2024 Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick

We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.

3D Reconstruction Object Recognition +1

Understanding Video Transformers via Universal Concept Discovery

no code implementations 19 Jan 2024 Matthew Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, Pavel Tokmakov

Concretely, we seek to explain the decision-making process of video transformers based on high-level, spatiotemporal concepts that are automatically discovered.

Decision Making Fine-grained Action Recognition +3

TAO-Amodal: A Benchmark for Tracking Any Object Amodally

1 code implementation 19 Dec 2023 Cheng-Yen Hsieh, Kaihua Chen, Achal Dave, Tarasha Khurana, Deva Ramanan

Amodal perception, the ability to comprehend complete object structures from partial visibility, is a fundamental skill, even for infants.

Amodal Tracking Autonomous Driving +3

Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models

no code implementations 10 Oct 2023 Wen-Hsuan Chu, Adam W. Harley, Pavel Tokmakov, Achal Dave, Leonidas Guibas, Katerina Fragkiadaki

This begs the question: can we re-purpose these large-scale pre-trained static image models for open-vocabulary video tracking?

Object Object Tracking +5

Shape of You: Precise 3D shape estimations for diverse body types

no code implementations 14 Apr 2023 Rohan Sarkar, Achal Dave, Gerard Medioni, Benjamin Biggs

This paper presents Shape of You (SoY), an approach to improve the accuracy of 3D body shape estimation for vision-based clothing recommendation systems.

3D Human Reconstruction 3D Human Shape Estimation +1

Towards Long-Tailed 3D Detection

1 code implementation 16 Nov 2022 Neehar Peri, Achal Dave, Deva Ramanan, Shu Kong

Moreover, semantic classes are often organized within a hierarchy, e.g., tail classes such as child and construction-worker are arguably subclasses of pedestrian.

Differentiable Raycasting for Self-supervised Occupancy Forecasting

1 code implementation 4 Oct 2022 Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan

Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confound these two motions, making the representation difficult to use for downstream motion planners.

Autonomous Driving Motion Planning +1

BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video

1 code implementation 25 Sep 2022 Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan

Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA).

Ranked #4 on Long-tail Video Object Segmentation on BURST-val (using extra training data)

Long-tail Video Object Segmentation Multi-Object Tracking +6

Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)

2 code implementations 3 May 2022 Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt

Contrastively trained language-image models such as CLIP, ALIGN, and BASIC have demonstrated unprecedented robustness to multiple challenging natural distribution shifts.

Ranked #94 on Image Classification on ObjectNet (using extra training data)

Image Classification

Opening Up Open World Tracking

no code implementations CVPR 2022 Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé

A benchmark that would allow us to perform an apple-to-apple comparison of existing efforts is a crucial first step towards advancing this important research field.

Ranked #3 on Open-World Video Segmentation on BURST-val (using extra training data)

Multi-Object Tracking Object +1

Opening up Open-World Tracking

no code implementations 22 Apr 2021 Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé

We hope to open a new front in multi-object tracking research that will hopefully bring us a step closer to intelligent systems that can operate safely in the real world.

Multi-Object Tracking Object

Detecting Invisible People

1 code implementation ICCV 2021 Tarasha Khurana, Achal Dave, Deva Ramanan

We demonstrate that current detection and tracking systems perform dramatically worse on this task.

Monocular Depth Estimation Object +3

TAO: A Large-Scale Benchmark for Tracking Any Object

no code implementations ECCV 2020 Achal Dave, Tarasha Khurana, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan

To this end, we ask annotators to label objects that move at any point in the video, and give names to them post factum.

Multi-Object Tracking Object +2

Learning to Track Any Object

no code implementations 25 Oct 2019 Achal Dave, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan

Moreover, at test time the same network can be applied to detection and tracking, resulting in a unified approach for the two tasks.

Instance Segmentation Object +5

Do Image Classifiers Generalize Across Time?

1 code implementation ICCV 2021 Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt

Additionally, we evaluate three detection models and show that natural perturbations induce both classification as well as localization errors, leading to a median drop in detection mAP of 14 points.

General Classification Video Object Detection

Towards Segmenting Anything That Moves

1 code implementation 11 Feb 2019 Achal Dave, Pavel Tokmakov, Deva Ramanan

To address this concern, we propose two new benchmarks for generic, moving object detection, and show that our model matches top-down methods on common categories, while significantly out-performing both top-down and bottom-up methods on never-before-seen categories.

Action Detection Instance Segmentation +7

Predictive-Corrective Networks for Action Detection

no code implementations CVPR 2017 Achal Dave, Olga Russakovsky, Deva Ramanan

While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing.

Action Detection Optical Flow Estimation +2
