Search Results for author: Shufan Li

Found 10 papers, 8 papers with code

PopAlign: Population-Level Alignment for Fair Text-to-Image Generation

1 code implementation · 28 Jun 2024 · Shufan Li, Harkanwar Singh, Aditya Grover

To address this limitation, we introduce PopAlign, a novel approach for population-level preference optimization: whereas standard preference optimization compares individual samples, PopAlign expresses preferences over entire sets of samples.
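The contrast between sample-level and set-level (population-level) preferences can be sketched as follows. This is a minimal illustration, not the paper's objective: the Bradley-Terry preference form and the mean-pooling of set scores are assumptions made here for clarity.

```python
import math

def sample_level_pref(score_a, score_b):
    """Probability that sample a is preferred over sample b (Bradley-Terry form)."""
    return 1.0 / (1.0 + math.exp(score_b - score_a))

def population_level_pref(scores_a, scores_b):
    """Prefer one *set* of generations over another by comparing pooled set scores."""
    pool_a = sum(scores_a) / len(scores_a)
    pool_b = sum(scores_b) / len(scores_b)
    return sample_level_pref(pool_a, pool_b)

# toy usage: set A's generations score higher on average than set B's
print(round(population_level_pref([1.0, 2.0], [0.0, 1.0]), 3))
```

The pooling step is the key difference: the preference signal attaches to a whole population of outputs at once, which is what allows distributional properties (such as demographic balance) to be optimized.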

Text-to-Image Generation

Aligning Diffusion Models by Optimizing Human Utility

no code implementations · 6 Apr 2024 · Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka

We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility.
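The "expected human utility" objective can be sketched in miniature. This is a hedged illustration of the general idea of utility-maximization alignment, not Diffusion-KTO's exact formulation: the sigmoid utility, the `beta` scale, and the per-sample desirable/undesirable labels are assumptions made here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def utility(reward, desirable, beta=0.1):
    """Utility of one sample: desirable samples gain value as their
    implicit reward grows; undesirable samples lose value."""
    return sigmoid(beta * reward) if desirable else sigmoid(-beta * reward)

def expected_utility(rewards, labels, beta=0.1):
    """Average utility over a batch of (reward, desirable?) pairs --
    the quantity a utility-maximization objective would push up."""
    return sum(utility(r, d, beta) for r, d in zip(rewards, labels)) / len(rewards)

print(round(expected_utility([2.0, -1.0], [True, False]), 4))
```

A training loop would maximize this expectation over the model's generations, which only requires per-sample desirability signals rather than paired preference comparisons.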

xT: Nested Tokenization for Larger Context in Large Images

1 code implementation · 4 Mar 2024 · Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam

There are many downstream applications in which global context matters as much as high frequency details, such as in real-world satellite imagery; in such cases researchers have to make the uncomfortable choice of which information to discard.

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

1 code implementation · 8 Feb 2024 · Shufan Li, Harkanwar Singh, Aditya Grover

A recent architecture, Mamba, based on state space models has been shown to achieve comparable performance for modeling text sequences, while scaling linearly with the sequence length.
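The linear scaling comes from the state-space recurrence at the core of such models: each token triggers one constant-cost state update. A minimal sketch of a discrete linear state-space scan (illustrative only; Mamba's selective variant makes the parameters input-dependent):

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t over a 1-D sequence.

    Cost is O(L) in the sequence length L, versus O(L^2) for attention.
    """
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:            # one fixed-cost state update per token
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)

# toy usage: 8-step sequence, 4-dimensional hidden state
rng = np.random.default_rng(0)
y = ssm_scan(rng.standard_normal(8), 0.9 * np.eye(4),
             rng.standard_normal(4), rng.standard_normal(4))
print(y.shape)
```

Extending this one-dimensional scan to images, video, or climate grids (multi-dimensional data) is the design question Mamba-ND addresses.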

Action Recognition · Weather Forecasting

InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following

1 code implementation · 11 Dec 2023 · Shufan Li, Harkanwar Singh, Aditya Grover

We demonstrate that our system can perform a series of novel instruction-guided editing tasks.

Decoder · Instruction Following

Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning

1 code implementation · ICCV 2023 · Colorado J. Reed, Ritwik Gupta, Shufan Li, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, Trevor Darrell

Large, pretrained models are commonly finetuned with imagery that is heavily augmented to mimic different conditions and scales, with the resulting models used for various tasks with imagery from a range of spatial scales.

Representation Learning

Refine and Represent: Region-to-Object Representation Learning

1 code implementation · 25 Aug 2022 · Akash Gokul, Konstantinos Kallidromitis, Shufan Li, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, Colorado J Reed

Recent works in self-supervised learning have demonstrated strong performance on scene-level dense prediction tasks by pretraining with object-centric or region-based correspondence objectives.

Object Representation Learning +4

Interpreting Audiograms with Multi-stage Neural Networks

1 code implementation · 17 Dec 2021 · Shufan Li, Congxi Lu, Linkai Li, Jirong Duan, Xinping Fu, Haoshuai Zhou

Audiograms are a particular type of line charts representing individuals' hearing level at various frequencies.
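An audiogram can be sketched as structured data: a hearing threshold in dB HL at each standard test frequency. The severity bands below follow common clinical conventions; the function and variable names are assumptions made here for illustration, not the paper's pipeline.

```python
def classify_hearing_loss(db_hl):
    """Map a hearing threshold in dB HL to a common severity category."""
    if db_hl <= 25:
        return "normal"
    if db_hl <= 40:
        return "mild"
    if db_hl <= 55:
        return "moderate"
    if db_hl <= 70:
        return "moderately severe"
    if db_hl <= 90:
        return "severe"
    return "profound"

# toy audiogram for one ear: frequency (Hz) -> threshold (dB HL)
audiogram = {250: 15, 500: 20, 1000: 30, 2000: 45, 4000: 65, 8000: 80}
print({f: classify_hearing_loss(t) for f, t in audiogram.items()})
```

The multi-stage setup in the paper reads this structure out of chart images; the sketch above only shows the target representation such a system would produce.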
