no code implementations • 6 Apr 2024 • Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility.
1 code implementation • 4 Mar 2024 • Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam
Modern computer vision pipelines handle large images in one of two sub-optimal ways: down-sampling or cropping.
1 code implementation • 8 Feb 2024 • Shufan Li, Harkanwar Singh, Aditya Grover
A recent architecture, Mamba, based on state space models, has been shown to achieve performance comparable to Transformers for modeling text sequences while scaling linearly with the sequence length.
1 code implementation • 11 Dec 2023 • Shufan Li, Harkanwar Singh, Aditya Grover
We demonstrate that our system can perform a series of novel instruction-guided editing tasks.
1 code implementation • NeurIPS 2023 • Xudong Wang, Shufan Li, Konstantinos Kallidromitis, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions.
Ranked #1 on Image Segmentation on Pascal Panoptic Parts
1 code implementation • ICCV 2023 • Colorado J. Reed, Ritwik Gupta, Shufan Li, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, Trevor Darrell
Large, pretrained models are commonly finetuned with imagery that is heavily augmented to mimic different conditions and scales, with the resulting models used for various tasks with imagery from a range of spatial scales.
no code implementations • 25 Nov 2022 • Shufan Li, Congxi Lu, Linkai Li, Haoshuai Zhou
We collected two datasets consisting of real camera photos for evaluation.
1 code implementation • 25 Aug 2022 • Akash Gokul, Konstantinos Kallidromitis, Shufan Li, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, Colorado J. Reed
Recent works in self-supervised learning have demonstrated strong performance on scene-level dense prediction tasks by pretraining with object-centric or region-based correspondence objectives.
1 code implementation • 17 Dec 2021 • Shufan Li, Congxi Lu, Linkai Li, Jirong Duan, Xinping Fu, Haoshuai Zhou
Audiograms are a particular type of line chart representing an individual's hearing level at various frequencies.