Search Results for author: Wan-Cyuan Fan

Found 10 papers, 2 papers with code

TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking

no code implementations • 13 Dec 2023 • Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal

In this work, we propose a novel clip-based DETR-style encoder-decoder architecture, which focuses on systematically analyzing and addressing the aforementioned challenges.

Semantic Segmentation, Video Object Segmentation, +1
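
As a rough illustration of the clip-based DETR-style design the snippet describes, the sketch below runs a transformer encoder over flattened spatio-temporal clip tokens and decodes learned object queries into per-token mask logits. Every dimension, module name, and the dot-product mask head are assumptions for illustration, not TAM-VT's actual architecture.

```python
# Minimal sketch of a clip-based DETR-style encoder-decoder for video
# segmentation. Shapes, hyperparameters, and the mask head are assumptions.
import torch
import torch.nn as nn

class ClipDETRSegmenter(nn.Module):
    def __init__(self, dim=256, num_queries=16, heads=8, layers=4):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)  # learned object queries
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), layers)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, heads, batch_first=True), layers)
        self.mask_embed = nn.Linear(dim, dim)  # projects queries to mask space

    def forward(self, clip_feats):
        # clip_feats: (B, T*H*W, dim) -- flattened spatio-temporal clip tokens
        memory = self.encoder(clip_feats)
        q = self.queries.weight.unsqueeze(0).expand(clip_feats.size(0), -1, -1)
        obj = self.decoder(q, memory)                # (B, num_queries, dim)
        # dot-product mask prediction: one mask logit per query per token
        masks = torch.einsum("bqd,bnd->bqn", self.mask_embed(obj), memory)
        return masks  # reshape to (B, Q, T, H, W) outside

model = ClipDETRSegmenter()
feats = torch.randn(2, 8 * 16 * 16, 256)  # e.g. 8 frames of 16x16 tokens
print(model(feats).shape)                 # torch.Size([2, 16, 2048])
```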

Target-Free Text-guided Image Manipulation

no code implementations • 26 Nov 2022 • Wan-Cyuan Fan, Cheng-Fu Yang, Chiao-An Yang, Yu-Chiang Frank Wang

We tackle the problem of target-free text-guided image manipulation, which requires one to modify the input reference image based on the given text instruction, while no ground truth target image is observed during training.

counterfactual, Image Manipulation
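
The snippet describes training with no ground-truth target image. One common way to make that feasible, not necessarily this paper's objective, is to score the manipulated output against the text instruction in a joint image-text embedding space such as CLIP, plus a term that keeps the output close to the reference. The sketch below assumes CLIP-preprocessed image tensors and tokenized text.

```python
# Hedged sketch of a target-free objective: reward text-image agreement in
# CLIP space and penalize drift from the reference. Illustrative only; this
# is not the paper's actual loss.
import torch
import torch.nn.functional as F
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

def target_free_loss(edited, reference, text_tokens, lam=0.5):
    # edited, reference: (B, 3, 224, 224) CLIP-preprocessed images
    # text_tokens: output of clip.tokenize(...), e.g. the instruction text
    img_emb = clip_model.encode_image(edited)
    txt_emb = clip_model.encode_text(text_tokens)
    sim = F.cosine_similarity(img_emb, txt_emb).mean()  # text-image agreement
    recon = F.l1_loss(edited, reference)                # content preservation
    return -sim + lam * recon
```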

Paraphrasing Is All You Need for Novel Object Captioning

no code implementations • 25 Sep 2022 • Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Ruslan Salakhutdinov, Louis-Philippe Morency, Yu-Chiang Frank Wang

Since no ground truth captions are available for novel object images during training, our P2C leverages cross-modality (image-text) association modules to ensure the above caption characteristics can be properly preserved.

Language Modelling, Object
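
The cross-modality (image-text) association modules the snippet mentions can be pictured as a small scorer that projects image and caption features into a joint space and rates their agreement; such a score can supervise captions when no ground truth exists. The encoder dimensions and scoring head below are assumptions, not the P2C internals.

```python
# Illustrative image-text association scorer: higher score means the caption
# better matches the image. Feature dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AssociationModule(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, joint_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, joint_dim)
        self.txt_proj = nn.Linear(txt_dim, joint_dim)

    def forward(self, img_feat, txt_feat):
        # Project both modalities to a joint space, score by cosine similarity.
        i = F.normalize(self.img_proj(img_feat), dim=-1)
        t = F.normalize(self.txt_proj(txt_feat), dim=-1)
        return (i * t).sum(-1)  # (B,) association scores

scores = AssociationModule()(torch.randn(4, 2048), torch.randn(4, 768))
```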

Scene Graph Expansion for Semantics-Guided Image Outpainting

no code implementations • CVPR 2022 • Chiao-An Yang, Cheng-Yo Tan, Wan-Cyuan Fan, Cheng-Fu Yang, Meng-Lin Wu, Yu-Chiang Frank Wang

In particular, we propose a novel network of Scene Graph Transformer (SGT), which is designed to take node and edge features as inputs for modeling the associated structural information.

Image Outpainting
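
One simple way to realize a transformer that takes node and edge features as inputs, as the snippet describes, is to treat edges as extra tokens alongside nodes, distinguished by a type embedding. This is a hedged sketch of that idea; the actual SGT wiring may differ.

```python
# Sketch of a scene-graph transformer over a joint node+edge token sequence.
# Dimensions and the type-embedding scheme are assumptions.
import torch
import torch.nn as nn

class SceneGraphTransformer(nn.Module):
    def __init__(self, dim=256, heads=8, layers=4):
        super().__init__()
        self.type_embed = nn.Embedding(2, dim)  # 0 = node token, 1 = edge token
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), layers)

    def forward(self, node_feats, edge_feats):
        # node_feats: (B, N, dim); edge_feats: (B, E, dim)
        nodes = node_feats + self.type_embed.weight[0]
        edges = edge_feats + self.type_embed.weight[1]
        tokens = torch.cat([nodes, edges], dim=1)  # joint node+edge sequence
        out = self.encoder(tokens)
        n = node_feats.size(1)
        return out[:, :n], out[:, n:]  # updated node and edge representations
```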

Learning Visual-Linguistic Adequacy, Fidelity, and Fluency for Novel Object Captioning

no code implementations • 29 Sep 2021 • Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Yu-Chiang Frank Wang, Louis-Philippe Morency, Ruslan Salakhutdinov

Novel object captioning (NOC) learns image captioning models for describing objects or visual concepts which are unseen (i.e., novel) in the training captions.

Image Captioning

LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity

1 code implementation • CVPR 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

To better exploit the text input so that implicit objects or relationships can be properly inferred during layout generation, we propose the LayoutTransformer Network (LT-Net) in this paper.
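
To picture how objects implied by the text can be inferred during layout generation, here is a hedged sketch of a text-conditioned layout decoder that autoregressively predicts object categories and boxes while cross-attending to encoded text. The vocabularies, heads, and box parameterization are illustrative assumptions, not LT-Net's exact design.

```python
# Sketch of autoregressive, text-conditioned layout generation: given the
# objects generated so far and encoded text tokens, predict the next object's
# category and box. All module choices are assumptions.
import torch
import torch.nn as nn

class LayoutDecoder(nn.Module):
    def __init__(self, dim=256, num_classes=100, heads=8, layers=4):
        super().__init__()
        self.obj_embed = nn.Embedding(num_classes, dim)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, heads, batch_first=True), layers)
        self.cls_head = nn.Linear(dim, num_classes)  # next object category
        self.box_head = nn.Linear(dim, 4)            # (x, y, w, h) in [0, 1]

    def forward(self, prev_objs, text_memory):
        # prev_objs: (B, L) category ids generated so far
        # text_memory: (B, T, dim) encoded text tokens to cross-attend over
        L = prev_objs.size(1)
        causal = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        h = self.decoder(self.obj_embed(prev_objs), text_memory,
                         tgt_mask=causal)
        return self.cls_head(h), self.box_head(h).sigmoid()
```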

LayoutTransformer: Relation-Aware Scene Layout Generation

no code implementations • 1 Jan 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

In machine learning and computer vision, text-to-image synthesis aims to produce images given input text.

Image Generation, Object, +1

Natural World Distribution via Adaptive Confusion Energy Regularization

no code implementations • 1 Jan 2021 • Yen-Chi Hsu, Cheng-Yao Hong, Wan-Cyuan Fan, Ding-Jie Chen, Ming-Sui Lee, Davi Geiger, Tyng-Luh Liu

The Fine-Grained Visual Classification (FGVC) problem is notably characterized by two intriguing properties, significant inter-class similarity and intra-class variation, which make learning an effective FGVC classifier a challenging task.

Fine-Grained Image Classification
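
The title suggests a confusion-based regularizer for FGVC. Below is a generic sketch of one such penalty: the probability mass a classifier spreads over non-target classes, which grows when inter-class similarity causes confusion. This only illustrates the confusion idea; it is not the paper's adaptive confusion energy formulation.

```python
# Generic confusion-style penalty added to cross-entropy for FGVC training.
# The penalty form and weight beta are assumptions for illustration.
import torch
import torch.nn.functional as F

def confusion_penalty(logits, labels):
    probs = F.softmax(logits, dim=-1)
    # Zero out the true-class probability; what remains is the confused mass
    # assigned to visually similar (but wrong) fine-grained classes.
    confused = probs.scatter(1, labels.unsqueeze(1), 0.0)
    return confused.sum(dim=-1).mean()

def loss_fn(logits, labels, beta=0.1):
    return F.cross_entropy(logits, labels) + beta * confusion_penalty(logits, labels)
```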
