Search Results for author: Daniel Bolya

Found 11 papers, 8 papers with code

Window Attention is Bugged: How not to Interpolate Position Embeddings

no code implementations • 9 Nov 2023 • Daniel Bolya, Chaitanya Ryali, Judy Hoffman, Christoph Feichtenhofer

To fix it, we introduce a simple absolute window position embedding strategy, which solves the bug outright in Hiera and allows us to increase both speed and performance of the model in ViTDet.

Position

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

2 code implementations • 1 Jun 2023 • Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer

Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.

Ranked #1 on Image Classification on iNaturalist 2019 (using extra training data)

Action Classification Action Recognition In Videos +4

692

ZipIt! Merging Models from Different Tasks without Training

1 code implementation • 4 May 2023 • George Stoica, Daniel Bolya, Jakob Bjorner, Pratik Ramesh, Taylor Hearn, Judy Hoffman

While this works for models trained on the same task, we find that this fails to account for the differences in models trained on disjoint tasks.

257

Token Merging for Fast Stable Diffusion

3 code implementations • 30 Mar 2023 • Daniel Bolya, Judy Hoffman

In the process, we speed up image generation by up to 2x and reduce memory consumption by up to 5.6x.

Image Generation

3,799

Token Merging: Your ViT But Faster

3 code implementations • 17 Oct 2022 • Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman

Off-the-shelf, ToMe can 2x the throughput of state-of-the-art ViT-L @ 512 and ViT-H @ 518 models on images and 2.2x the throughput of ViT-L on video with only a 0.2-0.3% accuracy drop in each case.

Ranked #13 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Efficient ViTs

1,207

Hydra Attention: Efficient Attention with Many Heads

no code implementations • 15 Sep 2022 • Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman

While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult.

Scalable Diverse Model Selection for Accessible Transfer Learning

1 code implementation • NeurIPS 2021 • Daniel Bolya, Rohit Mittapalli, Judy Hoffman

In this paper, we formalize this setting as "Scalable Diverse Model Selection" and propose several benchmarks for evaluating on this task.

Model Selection Transfer Learning

Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses

no code implementations • 25 Aug 2020 • Fu Lin, Rohit Mittapalli, Prithvijit Chattopadhyay, Daniel Bolya, Judy Hoffman

Convolutional Neural Networks have been shown to be vulnerable to adversarial examples, which are known to lie in subspaces close to where normal data lies but are not naturally occurring and have low probability.

Adversarial Defense Adversarial Robustness

TIDE: A General Toolbox for Identifying Object Detection Errors

2 code implementations • ECCV 2020 • Daniel Bolya, Sean Foley, James Hays, Judy Hoffman

We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms.

Instance Segmentation Object Detection +2

687

YOLACT++: Better Real-time Instance Segmentation

36 code implementations • 3 Dec 2019 • Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Ranked #15 on Real-time Instance Segmentation on MSCOCO (using extra training data)

Real-time Instance Segmentation Segmentation +1

4,924

YOLACT: Real-time Instance Segmentation

48 code implementations • ICCV 2019 • Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Ranked #21 on Real-time Instance Segmentation on MSCOCO (using extra training data)

Real-time Instance Segmentation Segmentation +2

27,790
