10 code implementations • 1 Aug 2024 • Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos.
Ranked #1 on Semi-Supervised Video Object Segmentation on MOSE (using extra training data)
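A minimal image-segmentation sketch using the `sam2` package from the paper's GitHub release (facebookresearch/sam2); the checkpoint/config paths and the point prompt below are placeholders, not the paper's settings:

```python
# Minimal SAM 2 image-segmentation sketch. Assumes the `sam2` package from
# https://github.com/facebookresearch/sam2 is installed and that the
# checkpoint/config below point at locally downloaded files (placeholder paths).
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

checkpoint = "checkpoints/sam2_hiera_large.pt"  # placeholder local path
model_cfg = "sam2_hiera_l.yaml"                 # config name from the repo

predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))
image = np.array(Image.open("example.jpg").convert("RGB"))

with torch.inference_mode():
    predictor.set_image(image)
    # Prompt with a single foreground point at (x, y); label 1 = foreground.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
    )
```

The repository also exposes a video predictor that propagates such prompts across frames, which is the promptable video segmentation setting behind the MOSE result above.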
no code implementations • 9 Nov 2023 • Daniel Bolya, Chaitanya Ryali, Judy Hoffman, Christoph Feichtenhofer
To fix this bug, we introduce a simple absolute window position embedding strategy, which resolves it outright in Hiera and allows us to increase both the speed and performance of the model in ViTDet.
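A minimal sketch of the idea, assuming a channels-first feature map whose spatial size is a multiple of the window size: a learned window-sized embedding is tiled across the feature map instead of interpolating a single global embedding (the full fix in the paper also keeps a separate global embedding, omitted here):

```python
# Sketch of an absolute window position embedding: learn an embedding the
# size of one attention window and tile it across the feature map, so every
# window sees identical position codes at any resolution. Illustrative,
# not the authors' exact implementation.
import torch
import torch.nn as nn

class AbsWinPosEmbed(nn.Module):
    def __init__(self, dim: int, window_size: int = 8):
        super().__init__()
        self.window_size = window_size
        # One learned embedding per position *within* a window.
        self.pos_embed_window = nn.Parameter(
            torch.zeros(1, dim, window_size, window_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) with H and W divisible by the window size.
        _, _, H, W = x.shape
        tiled = self.pos_embed_window.tile(
            1, 1, H // self.window_size, W // self.window_size)
        return x + tiled
```

Because the embedding is defined per window rather than per image, changing the input resolution adds or removes whole windows instead of stretching the position codes, which is what made interpolation misalign with window attention.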
4 code implementations • 1 Jun 2023 • Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer
Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.
Ranked #1 on Image Classification on iNaturalist 2019 (using extra training data)
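A sketch of loading a pretrained Hiera classifier via `torch.hub`, following the facebookresearch/hiera README; the entry-point and checkpoint names are assumptions that may differ between releases:

```python
# Load a Hiera backbone pretrained with MAE and fine-tuned on ImageNet-1k.
# Entry-point and checkpoint names follow the facebookresearch/hiera README
# (assumed; check the repo for the current list).
import torch

model = torch.hub.load(
    "facebookresearch/hiera",
    model="hiera_base_224",
    pretrained=True,
    checkpoint="mae_in1k_ft_in1k",
)
model.eval()

# Hiera is a plain hierarchical ViT: a 224x224 image in, class logits out.
with torch.inference_mode():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # expected: torch.Size([1, 1000])
```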
1 code implementation • NeurIPS 2023 • Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Haoqi Fan, Yanghao Li, Shang-Wen Li, Gargi Ghosh, Jitendra Malik, Christoph Feichtenhofer
We present Masked Audio-Video Learners (MAViL) to train audio-visual representations.
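A conceptual sketch of the masked audio-video pretraining step, with illustrative shapes and modules; MAViL's contrastive and student-teacher distillation objectives are omitted:

```python
# Conceptual sketch in the spirit of MAViL: mask most audio (spectrogram) and
# video patch tokens, jointly encode only the visible ones, and train a decoder
# (omitted) to reconstruct the masked patches. All shapes are illustrative.
import torch
import torch.nn as nn

def random_mask(tokens: torch.Tensor, keep_ratio: float = 0.2):
    """Keep a random subset of tokens; return kept tokens and their indices."""
    B, N, D = tokens.shape
    n_keep = max(1, int(N * keep_ratio))
    idx = torch.rand(B, N).argsort(dim=1)[:, :n_keep]
    return torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, D)), idx

dim = 256
audio_tokens = torch.randn(4, 128, dim)  # spectrogram patch embeddings
video_tokens = torch.randn(4, 512, dim)  # space-time patch embeddings

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)

a_vis, _ = random_mask(audio_tokens)
v_vis, _ = random_mask(video_tokens)
fused = encoder(torch.cat([a_vis, v_vis], dim=1))  # joint audio-video features
```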
no code implementations • NeurIPS Workshop ImageNet_PPF 2021 • Chaitanya Ryali, David J. Schwab, Ari S. Morcos
Through a systematic, comprehensive investigation, we show that background augmentations lead to improved generalization with substantial improvements (~1-2% on ImageNet) in performance across a spectrum of state-of-the-art self-supervised methods (MoCo-v2, BYOL, SwAV) on a variety of tasks, even enabling performance on par with the supervised baseline.
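A minimal sketch of one such background augmentation, assuming a foreground mask is available (e.g., from a saliency or segmentation model, one of the mask sources the paper studies):

```python
# Paste the masked foreground of one image onto a randomly chosen background
# before the usual self-supervised augmentations, so the pretext task cannot
# rely on background shortcuts. Shapes: HxWx3 uint8 images, HxW {0,1} mask.
import numpy as np

def background_swap(image: np.ndarray, fg_mask: np.ndarray,
                    background: np.ndarray) -> np.ndarray:
    mask = fg_mask[..., None].astype(image.dtype)
    return image * mask + background * (1 - mask)
```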
no code implementations • NeurIPS 2018 • Chaitanya Ryali, Gautam Reddy, Angela J. Yu
We derive the theoretical relationship between DBM and EXP, and show that EXP gains computational efficiency by foregoing the representation of inferential uncertainty (as does the delta rule), but that it nevertheless achieves near-Bayesian performance due to its ability to incorporate a "persistent prior" influence unique to DBM and absent from the other algorithms.
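A small simulation sketch of this comparison: the grid-based DBM filter below is the standard exact computation, while the EXP update with a persistent-prior term is an illustrative parameterization rather than the paper's exact derived mapping:

```python
# Compare exact DBM filtering with an exponential filter (EXP) that mixes in
# a fixed "persistent prior". Parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.8                                   # probability the rate persists
grid = np.linspace(0.01, 0.99, 99)            # grid over the Bernoulli rate
prior = np.ones_like(grid) / grid.size        # flat prior over the rate

# Generate observations from a DBM: the hidden rate occasionally resets.
gamma, xs = 0.7, []
for _ in range(200):
    if rng.random() > alpha:
        gamma = rng.random()                  # reset: redraw from the prior
    xs.append(rng.random() < gamma)

# Exact DBM: predictive mixes last posterior with the prior, then Bayes-update.
belief, dbm_pred = prior.copy(), []
for x in xs:
    likelihood = grid if x else 1.0 - grid
    belief = likelihood * (alpha * belief + (1 - alpha) * prior)
    belief /= belief.sum()
    dbm_pred.append((alpha * belief + (1 - alpha) * prior) @ grid)

# EXP: delta-rule-style exponential discounting plus a persistent-prior pull.
eta, p0, p_hat, exp_pred = 0.3, grid @ prior, 0.5, []
for x in xs:
    p_hat = (1 - eta) * p_hat + eta * x
    exp_pred.append(alpha * p_hat + (1 - alpha) * p0)

print(np.corrcoef(dbm_pred, exp_pred)[0, 1])  # typically high (near-Bayesian)
```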
no code implementations • NeurIPS 2018 • Chaitanya Ryali, Angela J. Yu
This statistical coding cost account explains both beauty-in-averageness (BiA), where facial blends generally have higher likelihood than their "parent faces", and ugliness-in-averageness (UiA), which arises when the preceding context or task restricts the face representation to a task-relevant subset of features, thus redefining statistical typicality and encoding cost within that subspace.
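A toy numeric illustration of the account, taking encoding cost as -log p(face); the Gaussian feature distributions, category parameters, and dimensionality are assumptions for illustration only:

```python
# BiA: under one broad population model of face features, a 50/50 blend lies
# nearer the mode than its parents, so its coding cost -log p is lower.
# UiA: if context restricts the model to one face category, a cross-category
# blend becomes atypical and costly. All distributions are illustrative.
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(1)
d = 10
population = mvn(mean=np.zeros(d), cov=np.eye(d))          # all faces
cat_a = mvn(mean=np.full(d, 1.5), cov=0.3 * np.eye(d))     # one face category
cat_b = mvn(mean=np.full(d, -1.5), cov=0.3 * np.eye(d))    # another category

parent_a = cat_a.rvs(random_state=rng)
parent_b = cat_b.rvs(random_state=rng)
blend = 0.5 * (parent_a + parent_b)

# BiA: the blend is cheaper to encode than either parent under the population.
print(-population.logpdf(blend), -population.logpdf(parent_a),
      -population.logpdf(parent_b))
# UiA: restricted to category A, the blend is far costlier than its parent.
print(-cat_a.logpdf(blend), -cat_a.logpdf(parent_a))
```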