Search Results for author: Sharon X. Huang

Found 8 papers, 5 papers with code

TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models

no code implementations • 25 Apr 2024 • Haomiao Ni, Bernhard Egger, Suhas Lohit, Anoop Cherian, Ye Wang, Toshiaki Koike-Akino, Sharon X. Huang, Tim K. Marks

To guide video generation with the additional image input, we propose a "repeat-and-slide" strategy that modulates the reverse denoising process, allowing the frozen diffusion model to synthesize a video frame-by-frame starting from the provided image.

ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis

1 code implementation • 15 Apr 2024 • Aashish Anantha Ramakrishnan, Sharon X. Huang, Dongwon Lee

With Large Language Models (LLM) achieving success in language and commonsense reasoning tasks, we explore the ability of different LLMs to identify and understand key subjects from abstractive captions.

Descriptive, Image Captioning +2

3D-Aware Talking-Head Video Motion Transfer

no code implementations • 5 Nov 2023 • Haomiao Ni, Jiachen Liu, Yuan Xue, Sharon X. Huang

In this paper, we propose a novel 3D-aware talking-head video motion transfer network, Head3D, which fully exploits the subject appearance information by generating a visually-interpretable 3D canonical head from the 2D subject frames with a recurrent network.

Novel View Synthesis

NeRF-Enhanced Outpainting for Faithful Field-of-View Extrapolation

no code implementations • 23 Sep 2023 • Rui Yu, Jiachen Liu, Zihan Zhou, Sharon X. Huang

In various applications, such as robotic navigation and remote visual assistance, expanding the field of view (FOV) of the camera proves beneficial for enhancing environmental perception.

Image Outpainting

Synthetic Augmentation with Large-scale Unconditional Pre-training

1 code implementation • 8 Aug 2023 • Jiarong Ye, Haomiao Ni, Peng Jin, Sharon X. Huang, Yuan Xue

To further reduce the dependency on annotated data, we propose a synthetic augmentation method called HistoDiffusion, which can be pre-trained on large-scale unlabeled datasets and later applied to a small-scale labeled dataset for augmented training.

Conditional Image-to-Video Generation with Latent Flow Diffusion Models

1 code implementation • CVPR 2023 • Haomiao Ni, Changhao Shi, Kai Li, Sharon X. Huang, Martin Renqiang Min

In this paper, we propose an approach for cI2V using novel latent flow diffusion models (LFDM) that synthesize an optical flow sequence in the latent space based on the given condition to warp the given image.

Image to Video Generation, Optical Flow Estimation

ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions

1 code implementation • 5 Jan 2023 • Aashish Anantha Ramakrishnan, Sharon X. Huang, Dongwon Lee

Advancements in Text-to-Image synthesis over recent years have focused more on improving the quality of generated samples on datasets with descriptive captions.

Benchmarking, Descriptive +2

Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis

1 code implementation • 2 Oct 2022 • Haomiao Ni, Yihao Liu, Sharon X. Huang, Yuan Xue

The novel design of dual branches combines the strengths of deformation-grid-based transformation and warp-free generation for better identity preservation and robustness to occlusion in the synthesized videos.

motion retargeting
