1 code implementation • 25 Jan 2024 • Letian Fu, Long Lian, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala, Trevor Darrell, Alexei A. Efros, Ken Goldberg
In this work, we re-examine inter-patch dependencies in the decoding mechanism of masked autoencoders (MAE).
1 code implementation • 28 Dec 2023 • Dantong Niu, Xudong Wang, Xinyang Han, Long Lian, Roei Herzig, Trevor Darrell
Several unsupervised image segmentation approaches have been proposed which eliminate the need for dense manually annotated segmentation masks; current models separately handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER), but not both (i.e., panoptic segmentation).
Ranked #1 on Unsupervised Panoptic Segmentation on COCO val2017
no code implementations • 27 Nov 2023 • Tsung-Han Wu, Long Lian, Joseph E. Gonzalez, Boyi Li, Trevor Darrell
Steered by an LLM controller, SLD turns text-to-image generation into an iterative closed-loop process, ensuring correctness in the resulting image.
no code implementations • 29 Sep 2023 • Long Lian, Baifeng Shi, Adam Yala, Trevor Darrell, Boyi Li
We show that LLMs are able to understand complex spatiotemporal dynamics from text alone and generate layouts that align closely with both the prompts and the object motion patterns typically observed in the real world.
1 code implementation • 23 May 2023 • Long Lian, Boyi Li, Adam Yala, Trevor Darrell
Our method significantly outperforms the base diffusion model and several strong baselines in accurately generating images according to prompts that require various capabilities, doubling the generation accuracy across four tasks on average.
1 code implementation • CVPR 2023 • Long Lian, Zhirong Wu, Stella X. Yu
The Gestalt law of common fate, i.e., what moves at the same speed belongs together, has inspired unsupervised object discovery based on motion segmentation.
Ranked #1 on Unsupervised Object Segmentation on FBMS-59
1 code implementation • ICCV 2023 • Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
We propose a novel post-training quantization (PTQ) method specifically tailored to the unique multi-timestep pipeline and model architecture of diffusion models, which compresses the noise estimation network to accelerate the generation process.
no code implementations • 17 Dec 2022 • Long Lian, Zhirong Wu, Stella X. Yu
Previous methods in unsupervised video object segmentation (UVOS) have demonstrated the effectiveness of motion as either input or supervision for segmentation.
1 code implementation • CVPR 2022 • Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu
Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balanced target data.
Ranked #1 on Few-Shot Image Classification on ImageNet - 0-Shot (using extra training data)
1 code implementation • 6 Oct 2021 • Xudong Wang, Long Lian, Stella X. Yu
Intuitively, no matter what the downstream task is, instances to be labeled must be representative and diverse: The former would facilitate label propagation to unlabeled data, whereas the latter would ensure coverage of the entire dataset.
Active Learning Semi-Supervised Image Classification (Cold Start)
no code implementations • CVPR 2021 • Xudong Wang, Long Lian, Stella X. Yu
Existing methods focus on training an RL policy that is universal to changing visual domains, whereas we focus on extracting visual foreground that is universal, feeding clean invariant vision to the RL policy learner.
2 code implementations • ICLR 2021 • Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu
We take a dynamic view of the training data and provide a principled model bias and variance analysis as the training data fluctuates: Existing long-tail classifiers invariably increase the model variance and the head-tail model bias gap remains large, due to more and larger confusion with hard negatives for the tail.
Ranked #22 on Long-tail Learning on iNaturalist 2018