no code implementations • 4 Sep 2024 • Kaiwen Zheng, Yongxin Chen, Hanzi Mao, Ming-Yu Liu, Jun Zhu, Qinsheng Zhang
Masked diffusion models (MDMs) have emerged as a popular research topic for generative modeling of discrete data, thanks to their superior performance over other discrete diffusion models, and are rivaling the auto-regressive models (ARMs) for language modeling tasks.
24 code implementations • ICCV 2023 • Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.
Ranked #2 on Zero-Shot Instance Segmentation on LVIS v1.0 val
10 code implementations • 30 Mar 2022 • Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He
Using the plain, non-hierarchical Vision Transformer (ViT) as the detection backbone enables the original ViT architecture to be fine-tuned for object detection without needing to redesign a hierarchical backbone for pre-training.
Ranked #1 on Cross-Domain Few-Shot Object Detection on Clipart1k (mAP metric)
Cross-Domain Few-Shot Object Detection, Instance Segmentation
49 code implementations • CVPR 2022 • Zhuang Liu, Hanzi Mao, Chao-yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.
Ranked #1 on Classification on InDL
3 code implementations • 29 Apr 2021 • Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.
Ranked #1 on Topic Classification on OS