2 code implementations • CVPR 2023 • Dina Bashkirova, Jose Lezama, Kihyuk Sohn, Kate Saenko, Irfan Essa
We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable structure-guided generation.
5 code implementations • 2 Jan 2023 • Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan
Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient because it uses discrete tokens and requires fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient because it uses parallel decoding.
Ranked #1 on Text-to-Image Generation on MS-COCO (FID metric)
no code implementations • ECCV 2018 • Qiang Qiu, Jose Lezama, Alex Bronstein, Guillermo Sapiro
In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNNs) into shallow random forests, with near-optimal information-theoretic code aggregation among trees.
no code implementations • CVPR 2017 • Jose Lezama, Qiang Qiu, Guillermo Sapiro
We observe that it is often equally effective to apply hallucination to the input NIR images or low-rank embedding to the output deep features of a VIS deep model for cross-spectral recognition.
no code implementations • CVPR 2014 • Jose Lezama, Rafael Grompone von Gioi, Gregory Randall, Jean-Michel Morel
We present a novel method for automatic vanishing point detection based on primal and dual point alignment detection.