no code implementations • 21 Dec 2023 • Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold, Lu Jiang
We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals.
Ranked #3 on Text-to-Video Generation on MSR-VTT
no code implementations • 11 Dec 2023 • Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, José Lezama
We present W.A.L.T, a transformer-based approach for photorealistic video generation via diffusion modeling.
Ranked #1 on Video Prediction on Kinetics-600 12 frames, 64x64
no code implementations • 9 Oct 2023 • Lijun Yu, José Lezama, Nitesh B. Gundavarapu, Luca Versari, Kihyuk Sohn, David Minnen, Yong Cheng, Vighnesh Birodkar, Agrim Gupta, Xiuye Gu, Alexander G. Hauptmann, Boqing Gong, Ming-Hsuan Yang, Irfan Essa, David A. Ross, Lu Jiang
While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.
Ranked #2 on Video Prediction on Kinetics-600 12 frames, 64x64
1 code implementation • 5 Aug 2023 • Guillermo Carbajal, Patricia Vitoria, José Lezama, Pablo Musé
Then, a second network, trained jointly with the first, unrolls a non-blind deconvolution method using the motion kernel field estimated by the first network.
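The two-stage idea above (a kernel-estimation network followed by an unrolled non-blind deconvolution) can be sketched in miniature. The sketch below is a hand-written stand-in, not the paper's implementation: the kernel is fixed rather than predicted by a network, the signal is 1-D, and the unrolled steps are plain gradient descent on the data-fidelity term.

```python
import numpy as np

def conv(x, k):
    # circular convolution via FFT (a simplifying assumption of this sketch)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, len(x))))

def conv_adjoint(x, k):
    # adjoint of the circular convolution operator
    return np.real(np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(k, len(x)))))

def unrolled_deconv(y, k, steps=200, lr=0.9):
    # Unrolled gradient descent on ||k * x - y||^2, starting from the blurry
    # input y. In the paper the motion kernel field comes from a first network
    # and the unrolled updates are trained jointly; here both are fixed
    # stand-ins chosen for illustration.
    x = y.copy()
    for _ in range(steps):
        residual = conv(x, k) - y
        x = x - lr * conv_adjoint(residual, k)
    return x
```

For a normalized blur kernel the operator norm is at most 1, so a step size below 2 keeps the iteration stable.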
no code implementations • 27 Dec 2022 • Bruno Galerne, Lara Raad, José Lezama, Jean-Michel Morel
Neural style transfer is a deep learning technique that transfers style from a style image to a content image with unprecedented richness, and is particularly impressive when the style image is a painting.
1 code implementation • CVPR 2023 • Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang
We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model.
Ranked #1 on Video Prediction on Something-Something V2
1 code implementation • CVPR 2023 • Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang
We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens fed to autoregressive or non-autoregressive transformers.
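The non-autoregressive decoding such token-based generative transformers use can be sketched as iterative parallel decoding: start from a fully masked token sequence and commit the highest-confidence predictions over a few steps. This is a toy sketch under stated assumptions — `score_fn` is a hypothetical stand-in for the transformer, not an API from the paper.

```python
import numpy as np

MASK = -1  # sentinel for a not-yet-decoded token position

def parallel_decode(n_tokens, n_steps, score_fn):
    # Non-autoregressive decoding sketch: at each step, `score_fn` proposes a
    # token and a confidence for every position; we commit the most confident
    # proposals at still-masked positions and keep the rest masked.
    tokens = np.full(n_tokens, MASK)
    for step in range(n_steps):
        proposals, conf = score_fn(tokens)
        conf = np.where(tokens == MASK, conf, -np.inf)  # only fill masked slots
        remaining = int((tokens == MASK).sum())
        k = int(np.ceil(remaining / (n_steps - step)))  # linear unmask schedule
        commit = np.argsort(conf)[-k:]
        tokens[commit] = proposals[commit]
    return tokens
```

With this schedule every position is decoded after `n_steps` passes, versus `n_tokens` passes for purely autoregressive, one-token-at-a-time decoding.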
1 code implementation • 26 Sep 2022 • Guillermo Carbajal, Patricia Vitoria, Pablo Musé, José Lezama
Successful training of end-to-end deep networks for real motion deblurring requires datasets of sharp/blurred image pairs that are realistic and diverse enough to achieve generalization to real blurred images.
1 code implementation • 9 Sep 2022 • José Lezama, Huiwen Chang, Lu Jiang, Irfan Essa
Given a masked-and-reconstructed real image, the Token-Critic model is trained to distinguish which visual tokens belong to the original image and which were sampled by the generative transformer.
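The training signal described above can be sketched as follows. This is a toy construction: the "generator samples" filling the masked positions are uniform random tokens rather than a transformer's outputs, and all names are illustrative, not the paper's code.

```python
import numpy as np

def token_critic_example(real_tokens, mask_ratio=0.5, vocab_size=1024, seed=0):
    # Mask a random subset of the token sequence, fill the masked positions
    # with stand-in generator samples (uniform random tokens here), and emit
    # the binary targets the Token-Critic is trained on:
    #   1 = token kept from the original image,
    #   0 = token sampled by the generative transformer.
    rng = np.random.default_rng(seed)
    masked = rng.random(len(real_tokens)) < mask_ratio
    mixed = real_tokens.copy()
    mixed[masked] = rng.integers(0, vocab_size, masked.sum())
    labels = (~masked).astype(np.int64)
    return mixed, labels
```

A critic network would then be trained with binary cross-entropy to predict `labels` from `mixed`, and its per-token scores can guide which tokens to resample during generation.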
1 code implementation • 1 Feb 2021 • Guillermo Carbajal, Patricia Vitoria, Mauricio Delbracio, Pablo Musé, José Lezama
In recent years, the removal of motion blur in photographs has seen impressive progress in the hands of deep learning-based methods, trained to map directly from blurry to sharp images.
no code implementations • ICLR 2019 • Igor M. Quintanilha, Roberto de M. E. Filho, José Lezama, Mauricio Delbracio, Leonardo O. Nunes
The ability to detect when an input sample was not drawn from the training distribution is an important desirable property of deep neural networks.
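A common baseline for this detection task is to threshold the maximum softmax probability of the classifier. The sketch below shows that generic baseline (Hendrycks & Gimpel style), not necessarily the method proposed in this paper.

```python
import numpy as np

def msp_score(logits):
    # Maximum softmax probability: a confident prediction (score near 1)
    # suggests an in-distribution input; a flat, low-confidence prediction
    # suggests the input may be out-of-distribution.
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)
```

In practice one calibrates a threshold on held-out in-distribution data and flags inputs whose score falls below it.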
1 code implementation • ICLR 2019 • José Lezama
A major challenge in learning image representations is disentangling the factors of variation underlying image formation.
no code implementations • 25 May 2018 • José Lezama, Samy Blusseau, Jean-Michel Morel, Gregory Randall, Rafael Grompone von Gioi
Using a computational quantitative version of the non-accidentalness principle, we raise the possibility that the psychophysical and the (older) gestaltist setups, both applicable on dot or Gabor patterns, find a useful complement in a Turing test.
1 code implementation • 5 Dec 2017 • José Lezama, Qiang Qiu, Pablo Musé, Guillermo Sapiro
Deep neural networks trained using a softmax layer at the top and the cross-entropy loss are ubiquitous tools for image classification.
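The standard setup the sentence refers to can be written out in a few lines; a minimal NumPy sketch:

```python
import numpy as np

def softmax(logits):
    # subtract the row-wise max for numerical stability
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # mean negative log-likelihood of the true class under the softmax
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()
```

With uniform logits over C classes the loss is log C, the usual sanity check for an untrained classifier.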