Search Results for author: Zhaowen Wang

Found 56 papers, 22 papers with code

Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation

no code implementations 30 Nov 2023 Linzi Xing, Quan Tran, Fabian Caba, Franck Dernoncourt, Seunghyun Yoon, Zhaowen Wang, Trung Bui, Giuseppe Carenini

Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks.

Contrastive Learning Segmentation +2

Improving Diffusion Models for Scene Text Editing with Dual Encoders

1 code implementation 12 Apr 2023 Jiabao Ji, Guanhua Zhang, Zhaowen Wang, Bairu Hou, Zhifei Zhang, Brian Price, Shiyu Chang

Scene text editing is a challenging task that involves modifying or inserting specified texts in an image while maintaining its natural and realistic appearance.

Scene Text Editing Style Transfer

Moment Detection in Long Tutorial Videos

1 code implementation ICCV 2023 Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui

To study this problem, we propose the first dataset of untrimmed, long-form tutorial videos for the task of Moment Detection called the Behance Moment Detection (BMD) dataset.

LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos

no code implementations 12 Oct 2022 JieLin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin

Livestream videos have become a significant part of online learning, where design, digital marketing, creative painting, and other skills are taught by experienced experts in the sessions, making them valuable materials.

Marketing Segmentation

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment

no code implementations 10 Oct 2022 JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding.

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition

1 code implementation 31 Jul 2022 Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen Wang, Xiang Bai

Thirdly, we utilize Transformer to learn the global feature on image-level and model the global relationship of the corner points, with the assistance of a corner-query cross-attention mechanism.

Scene Text Recognition

MHMS: Multimodal Hierarchical Multimedia Summarization

no code implementations 7 Apr 2022 JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output can play an essential role in real-world applications, i.e., automatically generating cover images and titles for news articles or providing introductions to online videos.

STALP: Style Transfer with Auxiliary Limited Pairing

no code implementations 20 Oct 2021 David Futschik, Michal Kučera, Michal Lukáč, Zhaowen Wang, Eli Shechtman, Daniel Sýkora

We present an approach to example-based stylization of images that uses a single pair of a source image and its stylized counterpart.

Style Transfer Translation

Font Completion and Manipulation by Cycling Between Multi-Modality Representations

1 code implementation 30 Aug 2021 Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin

The novel graph constructor maps a glyph's latent code to its graph representation that matches expert knowledge, which is trained to help the translation task.

Image-to-Image Translation Representation Learning +2

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

1 code implementation 23 Jul 2021 Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin

Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters.

Image Generation

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

1 code implementation CVPR 2021 Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi

We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e.g., non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models.

Segmentation Style Transfer +2

G-DARTS-A: Groups of Channel Parallel Sampling with Attention

no code implementations 16 Oct 2020 Zhaowen Wang, Wei Zhang, Zhiming Wang

Differentiable Architecture Search (DARTS) provides a baseline for searching effective network architectures based on gradients, but it is accompanied by huge computational overhead in searching for and training network architectures.

Texture Hallucination for Large-Factor Painting Super-Resolution

no code implementations ECCV 2020 Yulun Zhang, Zhifei Zhang, Stephen DiVerdi, Zhaowen Wang, Jose Echevarria, Yun Fu

We aim to super-resolve digital paintings, synthesizing realistic details from high-resolution reference painting materials for very large scaling factors (e.g., 8X, 16X).

Image Reconstruction Image Super-Resolution

An Internal Learning Approach to Video Inpainting

1 code implementation ICCV 2019 Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin

We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.

Optical Flow Estimation Video Inpainting

Large-scale Tag-based Font Retrieval with Generative Feature Learning

no code implementations ICCV 2019 Tianlang Chen, Zhaowen Wang, Ning Xu, Hailin Jin, Jiebo Luo

In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively.

Retrieval TAG

Privacy-Preserving Deep Action Recognition: An Adversarial Learning Framework and A New Dataset

5 code implementations 12 Jun 2019 Zhen-Yu Wu, Haotao Wang, Zhaowen Wang, Hailin Jin, Zhangyang Wang

We first discuss an innovative heuristic of cross-dataset training and evaluation, enabling the use of multiple single-task datasets (one with target task labels and the other with privacy labels) in our problem.

Action Recognition Privacy Preserving +1

Controllable Artistic Text Style Transfer via Shape-Matching GAN

1 code implementation ICCV 2019 Shuai Yang, Zhangyang Wang, Zhaowen Wang, Ning Xu, Jiaying Liu, Zongming Guo

In this paper, we present the first text style transfer network that allows for real-time control of the crucial stylistic degree of the glyph through an adjustable parameter.

Style Transfer Text Style Transfer

Multimodal Style Transfer via Graph Cuts

2 code implementations ICCV 2019 Yulun Zhang, Chen Fang, Yilin Wang, Zhaowen Wang, Zhe Lin, Yun Fu, Jimei Yang

An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features like Gram or covariance matrices.

Style Transfer

Dance Dance Generation: Motion Transfer for Internet Videos

no code implementations 30 Mar 2019 Yipin Zhou, Zhaowen Wang, Chen Fang, Trung Bui, Tamara L. Berg

This work presents computational methods for transferring body movements from one person to another with videos collected in the wild.


Image Super-Resolution by Neural Texture Transfer

2 code implementations CVPR 2019 Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi

Reference-based super-resolution (RefSR), on the other hand, has proven to be promising in recovering high-resolution (HR) details when a reference (Ref) image with similar content as that of the LR input is given.

Image Stylization Image Super-Resolution +1

Visual Font Pairing

no code implementations 19 Nov 2018 Shuhui Jiang, Zhaowen Wang, Aaron Hertzmann, Hailin Jin, Yun Fu

Third, font pairing is an asymmetric problem in that the roles played by header and body fonts are not interchangeable.

Metric Learning

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations ECCV 2018 Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Flow-Grounded Spatial-Temporal Video Prediction from Still Images

1 code implementation ECCV 2018 Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang

Existing video prediction methods mainly rely on observing multiple historical frames or focus on predicting the next one-frame.

Video Prediction

Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study

3 code implementations ECCV 2018 Zhen-Yu Wu, Zhangyang Wang, Zhaowen Wang, Hailin Jin

This paper aims to improve privacy-preserving visual recognition, an increasingly demanded feature in smart camera applications, by formulating a unique adversarial training framework.

Action Recognition Privacy Preserving +1

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations 10 Jul 2018 Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Re-Weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation

no code implementations CVPR 2018 Qingchao Chen, Yang Liu, Zhaowen Wang, Ian Wassell, Kevin Chetty

In this paper, we propose the Re-weighted Adversarial Adaptation Network (RAAN) to reduce the feature distribution divergence and adapt the classifier when domain discrepancies are disparate.

Open-Ended Question Answering Unsupervised Domain Adaptation

Multi-Task Adversarial Network for Disentangled Feature Learning

no code implementations CVPR 2018 Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell

The encoder and the discriminators are trained cooperatively on factors of interest, but in an adversarial way on factors of distraction.

Face Recognition Font Recognition +1

Reference-Conditioned Super-Resolution by Neural Texture Transfer

no code implementations 10 Apr 2018 Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi

We focus on transferring the high-resolution texture from reference images to the super-resolution process without the constraint of content similarity between reference and target images, which is a key difference from previous example-based methods.

Image Stylization Image Super-Resolution +1

Exploring Asymmetric Encoder-Decoder Structure for Context-based Sentence Representation Learning

no code implementations ICLR 2018 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Context information plays an important role in human language understanding, and it is also useful for machines to learn vector representations of language.

Representation Learning

Visual to Sound: Generating Natural Sound for Videos in the Wild

3 code implementations CVPR 2018 Yipin Zhou, Zhaowen Wang, Chen Fang, Trung Bui, Tamara L. Berg

As two of the five traditional human senses (sight, hearing, taste, smell, and touch), vision and sound are basic sources through which humans understand the world.

Multi-Content GAN for Few-Shot Font Style Transfer

6 code implementations CVPR 2018 Samaneh Azadi, Matthew Fisher, Vladimir Kim, Zhaowen Wang, Eli Shechtman, Trevor Darrell

In this work, we focus on the challenge of taking partial observations of highly-stylized text and generalizing the observations to generate unobserved glyphs in the ornamented typeface.

Font Style Transfer

Visually-Aware Fashion Recommendation and Design with Generative Image Models

no code implementations 7 Nov 2017 Wang-Cheng Kang, Chen Fang, Zhaowen Wang, Julian McAuley

Here, we seek to extend this contribution by showing that recommendation performance can be significantly improved by learning 'fashion aware' image representations directly, i.e., by training the image representation (from the pixel level) and the recommender system jointly; this contribution is related to recent work using Siamese CNNs, though we are able to show improvements over state-of-the-art recommendation techniques such as BPR and variants that make use of pre-trained visual features.

Recommendation Systems

Robust Video Super-Resolution With Learned Temporal Dynamics

no code implementations ICCV 2017 Ding Liu, Zhaowen Wang, Yuchen Fan, Xian-Ming Liu, Zhangyang Wang, Shiyu Chang, Thomas Huang

Second, we reduce the complexity of motion between neighboring frames using a spatial alignment network that is much more robust and efficient than competing alignment methods and can be jointly trained with the temporal adaptive network in an end-to-end manner.

Video Super-Resolution

Robust Lane Tracking with Multi-mode Observation Model and Particle Filtering

no code implementations 28 Jun 2017 Jiawei Huang, Zhaowen Wang

Automatic lane tracking involves estimating the underlying signal from a sequence of noisy signal observations.

Trimming and Improving Skip-thought Vectors

no code implementations 9 Jun 2017 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics.

Text Classification

Rethinking Skip-thought: A Neighborhood based Approach

no code implementations WS 2017 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

We train our skip-thought neighbor model on a large corpus with continuous sentences, and then evaluate the trained model on 7 tasks, which include semantic relatedness, paraphrase detection, and classification benchmarks.

General Classification

Universal Style Transfer via Feature Transforms

14 code implementations NeurIPS 2017 Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang

The whitening and coloring transforms reflect a direct matching of the feature covariance of the content image to that of a given style image, which shares a similar spirit with the optimization of the Gram-matrix-based cost in neural style transfer.

Image Reconstruction Style Transfer

AMC: Attention guided Multi-modal Correlation Learning for Image Search

2 code implementations CVPR 2017 Kan Chen, Trung Bui, Fang Chen, Zhaowen Wang, Ram Nevatia

According to the intent of query, attention mechanism can be introduced to adaptively balance the importance of different modalities.

Image Retrieval

Diversified Texture Synthesis with Feed-forward Networks

no code implementations CVPR 2017 Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang

Recent progress in deep discriminative and generative modeling has shown promising results on texture synthesis.

Texture Synthesis

Learning a Mixture of Deep Networks for Single Image Super-Resolution

no code implementations 3 Jan 2017 Ding Liu, Zhaowen Wang, Nasser Nasrabadi, Thomas Huang

This paper proposes the method of learning a mixture of SR inference modules in a unified framework to tackle this problem.

Image Super-Resolution

Vista: A Visually, Socially, and Temporally-aware Model for Artistic Recommendation

no code implementations 15 Jul 2016 Ruining He, Chen Fang, Zhaowen Wang, Julian McAuley

Understanding users' interactions with highly subjective content, like artistic images, is challenging due to the complex semantics that guide our preferences.

Recommendation Systems

Robust Single Image Super-Resolution via Deep Networks With Sparse Prior

1 code implementation journals 2016 Ding Liu, Zhaowen Wang, Bihan Wen, Jianchao Yang, Wei Han, Thomas S. Huang

We demonstrate that a sparse coding model particularly designed for SR can be incarnated as a neural network with the merit of end-to-end optimization over training data.

Image Super-Resolution

Image Captioning with Semantic Attention

no code implementations CVPR 2016 Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo

Automatically generating a natural language description of an image has recently attracted interest, both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing.

Image Captioning

Deep Networks for Image Super-Resolution with Sparse Prior

no code implementations ICCV 2015 Zhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, Thomas Huang

We show that a sparse coding model particularly designed for super-resolution can be incarnated as a neural network, and trained in a cascaded structure from end to end.

Image Restoration Image Super-Resolution

Learning Super-Resolution Jointly from External and Internal Examples

no code implementations 3 Mar 2015 Zhangyang Wang, Yingzhen Yang, Zhaowen Wang, Shiyu Chang, Jianchao Yang, Thomas S. Huang

Single image super-resolution (SR) aims to estimate a high-resolution (HR) image from a low-resolution (LR) input.

Image Super-Resolution

Scalable Similarity Learning using Large Margin Neighborhood Embedding

no code implementations 24 Apr 2014 Zhaowen Wang, Jianchao Yang, Zhe Lin, Jonathan Brandt, Shiyu Chang, Thomas Huang

In this paper, we present an image similarity learning method that can scale well in both the number of images and the dimensionality of image descriptors.

Metric Learning
