Search Results for author: Yu-Chuan Su

Found 20 papers, 1 paper with code

Instruct-Imagen: Image Generation with Multi-modal Instruction

no code implementations • 3 Jan 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia

We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.

Image Generation Retrieval

Fine-grained Controllable Video Generation via Object Appearance and Context

no code implementations • 5 Dec 2023 • Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang

To achieve detailed control, we propose a unified framework to jointly inject control signals into the existing text-to-video model.

Text-to-Video Generation Video Generation

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

no code implementations • ICCV 2023 • Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann

An authentic face restoration system is becoming increasingly demanding in many computer vision applications, e.g., image enhancement, video communication, and taking portrait.

Blind Face Restoration Denoising +2

Controllable One-Shot Face Video Synthesis With Semantic Aware Prior

no code implementations • 27 Apr 2023 • Kangning Liu, Yu-Chuan Su, Wei Hong, Ruijin Cang, Xuhui Jia

The one-shot talking-head synthesis task aims to animate a source image to another pose and expression, which is dictated by a driving frame.

Video Generation Beyond a Single Clip

no code implementations • 15 Apr 2023 • Hsin-Ping Huang, Yu-Chuan Su, Ming-Hsuan Yang

We tackle the long video generation problem, i.e., generating videos beyond the output length of video generation models.

Video Generation

Identity Encoder for Personalized Diffusion

no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.

Image Enhancement Image Generation

Rethinking Deep Face Restoration

no code implementations • CVPR 2022 • Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia

While existing approaches for face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot authentically reconstruct the faces.

Face Generation Face Reconstruction

2.5D Visual Relationship Detection

1 code implementation • 26 Apr 2021 • Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong

To enable progress on this task, we create a new dataset consisting of 220k human-annotated 2.5D relationships among 512K objects from 11K images.

Benchmarking Depth Estimation +2

Camera View Adjustment Prediction for Improving Image Composition

no code implementations • 15 Apr 2021 • Yu-Chuan Su, Raviteja Vemulapalli, Ben Weiss, Chun-Te Chu, Philip Andrew Mansfield, Lior Shapira, Colvin Pitts

To address this issue, we propose a deep learning-based approach that provides suggestions to the photographer on how to adjust the camera view before capturing.

Image Cropping

Kernel Transformer Networks for Compact Spherical Convolution

no code implementations • CVPR 2019 • Yu-Chuan Su, Kristen Grauman

KTNs efficiently transfer convolution kernels from perspective images to the equirectangular projection of 360° images.

Learning Compressible 360° Video Isomers

no code implementations • CVPR 2018 • Yu-Chuan Su, Kristen Grauman

Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.

Learning Compressible 360° Video Isomers

no code implementations • 12 Dec 2017 • Yu-Chuan Su, Kristen Grauman

Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.

Learning Spherical Convolution for Fast Features from 360° Imagery

no code implementations • NeurIPS 2017 • Yu-Chuan Su, Kristen Grauman

While 360° cameras offer tremendous new possibilities in vision, graphics, and augmented reality, the spherical images they produce make core feature extraction non-trivial.

Making 360° Video Watchable in 2D: Learning Videography for Click Free Viewing

no code implementations • 1 Mar 2017 • Yu-Chuan Su, Kristen Grauman

360° video requires human viewers to actively control "where" to look while watching the video.

Navigate

Pano2Vid: Automatic Cinematography for Watching 360° Videos

no code implementations • 7 Dec 2016 • Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman

AutoCam leverages NFOV web video to discriminatively identify space-time "glimpses" of interest at each time instant, and then uses dynamic programming to select optimal human-like camera trajectories.

Detecting Engagement in Egocentric Video

no code implementations • 4 Apr 2016 • Yu-Chuan Su, Kristen Grauman

In a wearable camera video, we see what the camera wearer sees.

Video Summarization

Leaving Some Stones Unturned: Dynamic Feature Prioritization for Activity Detection in Streaming Video

no code implementations • 1 Apr 2016 • Yu-Chuan Su, Kristen Grauman

Current approaches for activity recognition often ignore constraints on computational resources: 1) they rely on extensive feature computation to obtain rich descriptors on all frames, and 2) they assume batch-mode access to the entire test video at once.

Action Detection Activity Detection +2

Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network

no code implementations • 15 Sep 2014 • Yu-Chuan Su, Tzu-Hsuan Chiu, Chun-Yen Yeh, Hsin-Fu Huang, Winston H. Hsu

The same lack-of-training-sample problem limits the usage of deep models on a wide range of computer vision problems where obtaining training data is difficult.

4k Transfer Learning +1
