Search Results for author: Hailin Jin

Found 62 papers, 17 papers with code

MHMS: Multimodal Hierarchical Multimedia Summarization

no code implementations 7 Apr 2022 JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output can play an essential role in real-world applications, i.e., automatically generating cover images and titles for news articles or providing introductions to online videos.

StyleBabel: Artistic Style Tagging and Captioning

no code implementations 10 Mar 2022 Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.

Representation Learning TAG

Cross Modal Retrieval with Querybank Normalisation

1 code implementation 23 Dec 2021 Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie

In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries.

Cross-Modal Retrieval Metric Learning +2

Time-Equivariant Contrastive Video Representation Learning

no code implementations ICCV 2021 Simon Jenni, Hailin Jin

We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos.

Action Recognition Contrastive Learning +2

Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

no code implementations NeurIPS 2021 Reuben Tan, Bryan Plummer, Kate Saenko, Hailin Jin, Bryan Russell

Key to our approach is the ability to learn to spatially localize interactions with self-supervision on a large corpus of videos with accompanying transcribed narrations.

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

no code implementations 20 Oct 2021 Reuben Tan, Bryan A. Plummer, Kate Saenko, Hailin Jin, Bryan Russell

Key to our approach is the ability to learn to spatially localize interactions with self-supervision on a large corpus of videos with accompanying transcribed narrations.

StreamHover: Livestream Transcript Summarization and Annotation

1 code implementation EMNLP 2021 Sangwoo Cho, Franck Dernoncourt, Tim Ganter, Trung Bui, Nedim Lipka, Walter Chang, Hailin Jin, Jonathan Brandt, Hassan Foroosh, Fei Liu

With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge.

Extractive Summarization

Font Completion and Manipulation by Cycling Between Multi-Modality Representations

1 code implementation 30 Aug 2021 Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin

The novel graph constructor maps a glyph's latent code to its graph representation that matches expert knowledge, which is trained to help the translation task.

Image-to-Image Translation Representation Learning +2

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

1 code implementation 23 Jul 2021 Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin

Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters.

Image Generation

Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation

no code implementations ICCV 2021 Jiabo Huang, Yang Liu, Shaogang Gong, Hailin Jin

Video activity localisation has recently attained increasing attention due to its practical values in automatically localising the most salient visual segments corresponding to their language descriptions (sentences) from untrimmed and unstructured videos.

Compositional Sketch Search

1 code implementation 15 Jun 2021 Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.

Quantization Sketch-Based Image Retrieval

Magic Layouts: Structural Prior for Component Detection in User Interface Designs

no code implementations CVPR 2021 Dipu Manandhar, Hailin Jin, John Collomosse

We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts.

TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval

1 code implementation ICCV 2021 Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Samuel Albanie, Yang Liu

In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders.

Video Retrieval

Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

no code implementations CVPR 2020 Simon Jenni, Hailin Jin, Paolo Favaro

Based on this criterion, we introduce a novel image transformation that we call limited context inpainting (LCI).

Superpixel Segmentation with Fully Convolutional Networks

2 code implementations CVPR 2020 Fengting Yang, Qian Sun, Hailin Jin, Zihan Zhou

In computer vision, superpixels have been widely used as an effective way to reduce the number of image primitives for subsequent processing.

Disparity Estimation Stereo Matching +1

Neural Architecture Search for Deep Image Prior

2 code implementations 14 Jan 2020 Kary Ho, Andrew Gilbert, Hailin Jin, John Collomosse

We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP).

Image Restoration Neural Architecture Search +1

An Internal Learning Approach to Video Inpainting

1 code implementation ICCV 2019 Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin

We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.

Optical Flow Estimation Video Inpainting

Large-scale Tag-based Font Retrieval with Generative Feature Learning

no code implementations ICCV 2019 Tianlang Chen, Zhaowen Wang, Ning Xu, Hailin Jin, Jiebo Luo

In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively.

TAG

Privacy-Preserving Deep Action Recognition: An Adversarial Learning Framework and A New Dataset

5 code implementations 12 Jun 2019 Zhen-Yu Wu, Haotao Wang, Zhaowen Wang, Hailin Jin, Zhangyang Wang

We first discuss an innovative heuristic of cross-dataset training and evaluation, enabling the use of multiple single-task datasets (one with target task labels and the other with privacy labels) in our problem.

Action Recognition Privacy Preserving Deep Learning

Learning Video Representations from Correspondence Proposals

2 code implementations CVPR 2019 Xingyu Liu, Joon-Young Lee, Hailin Jin

In particular, it can effectively learn representations for videos by mixing appearance and long-range motion with an RGB-only input.

Action Recognition

LiveSketch: Query Perturbations for Guided Sketch-based Visual Search

no code implementations CVPR 2019 John Collomosse, Tu Bui, Hailin Jin

LiveSketch is a novel algorithm for searching large image collections using hand-sketched queries.

Visual Font Pairing

no code implementations 19 Nov 2018 Shuhui Jiang, Zhaowen Wang, Aaron Hertzmann, Hailin Jin, Yun Fu

Third, font pairing is an asymmetric problem in that the roles played by header and body fonts are not interchangeable.

Metric Learning

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations ECCV 2018 Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Interactive Boundary Prediction for Object Selection

no code implementations ECCV 2018 Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu

Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.

Interactive Segmentation Semantic Segmentation

Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study

3 code implementations ECCV 2018 Zhen-Yu Wu, Zhangyang Wang, Zhaowen Wang, Hailin Jin

This paper aims to improve privacy-preserving visual recognition, an increasingly demanded feature in smart camera applications, by formulating a unique adversarial training framework.

Action Recognition

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations 10 Jul 2018 Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Disentangling Structure and Aesthetics for Style-Aware Image Completion

no code implementations CVPR 2018 Andrew Gilbert, John Collomosse, Hailin Jin, Brian Price

Content-aware image completion or in-painting is a fundamental tool for the correction of defects or removal of objects in images.

Multi-Task Adversarial Network for Disentangled Feature Learning

no code implementations CVPR 2018 Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell

The encoder and the discriminators are trained cooperatively on factors of interest, but in an adversarial way on factors of distraction.

Face Recognition Font Recognition +1

Learning from Multi-domain Artistic Images for Arbitrary Style Transfer

1 code implementation 25 May 2018 Zheng Xu, Michael Wilber, Chen Fang, Aaron Hertzmann, Hailin Jin

We propose a fast feed-forward network for arbitrary style transfer, which can generate stylized images for previously unseen content and style image pairs.

Style Transfer

Exploring Asymmetric Encoder-Decoder Structure for Context-based Sentence Representation Learning

no code implementations ICLR 2018 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Context information plays an important role in human language understanding, and it is also useful for machines to learn vector representations of language.

Representation Learning

Sketching With Style: Visual Search With Sketches and Aesthetic Context

no code implementations ICCV 2017 John Collomosse, Tu Bui, Michael J. Wilber, Chen Fang, Hailin Jin

We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints.

Image Retrieval

Spatial-Semantic Image Search by Visual Feature Synthesis

no code implementations CVPR 2017 Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu

We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.

Image Retrieval

Trimming and Improving Skip-thought Vectors

no code implementations 9 Jun 2017 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics.

Text Classification

Rethinking Skip-thought: A Neighborhood based Approach

no code implementations WS 2017 Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

We train our skip-thought neighbor model on a large corpus with continuous sentences, and then evaluate the trained model on 7 tasks, which include semantic relatedness, paraphrase detection, and classification benchmarks.

General Classification

BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography

no code implementations ICCV 2017 Michael J. Wilber, Chen Fang, Hailin Jin, Aaron Hertzmann, John Collomosse, Serge Belongie

Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation.

Domain Adaptation

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

no code implementations CVPR 2017 Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, Thomas Funkhouser

One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for core scene understanding tasks such as semantic segmentation, normal prediction, and object edge detection.

Boundary Detection Edge Detection +4

Composition-Preserving Deep Photo Aesthetics Assessment

no code implementations CVPR 2016 Long Mai, Hailin Jin, Feng Liu

Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment.

Aesthetics Quality Assessment

Image Captioning with Semantic Attention

no code implementations CVPR 2016 Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo

Automatically generating a natural language description of an image has attracted interest recently, both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing.

Image Captioning

Multi-Instance Visual-Semantic Embedding

no code implementations 22 Dec 2015 Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille

Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.

General Classification Image Classification +1

Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks

no code implementations 20 Sep 2015 Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang

Sentiment analysis of such large scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis.

Sentiment Analysis

DeepFont: Identify Your Font from An Image

1 code implementation 12 Jul 2015 Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang

As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers.

Domain Adaptation Font Recognition +1

Fine-Grained Recognition Without Part Annotations

no code implementations CVPR 2015 Jonathan Krause, Hailin Jin, Jianchao Yang, Li Fei-Fei

Scaling up fine-grained recognition to all domains of fine-grained objects is a challenge the computer vision community will need to face in order to realize its goal of recognizing all object categories.

Collaborative Feature Learning from Social Media

no code implementations CVPR 2015 Chen Fang, Hailin Jin, Jianchao Yang, Zhe Lin

We validate our feature learning paradigm on this dataset and find that the learned feature significantly outperforms the state-of-the-art image features in learning better image similarities.

Large-Scale Visual Font Recognition

no code implementations CVPR 2014 Guang Chen, Jianchao Yang, Hailin Jin, Jonathan Brandt, Eli Shechtman, Aseem Agarwala, Tony X. Han

This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content.

Font Recognition Image Categorization +1

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training

no code implementations 21 Dec 2013 Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang

The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.

Large Displacement Optical Flow from Nearest Neighbor Fields

no code implementations CVPR 2013 Zhuoyuan Chen, Hailin Jin, Zhe Lin, Scott Cohen, Ying Wu

We use approximate nearest neighbor fields to compute an initial motion field and use a robust algorithm to compute a set of similarity transformations as the motion candidates for segmentation.

Motion Estimation Motion Segmentation +1

Specular Reflection Separation Using Dark Channel Prior

no code implementations CVPR 2013 Hyeongwoo Kim, Hailin Jin, Sunil Hadap, In-So Kweon

Our method is based on a novel observation that for most natural images the dark channel can provide an approximate specular-free image.

Plane-Based Content Preserving Warps for Video Stabilization

no code implementations CVPR 2013 Zihan Zhou, Hailin Jin, Yi Ma

Recently, a new image deformation technique called content-preserving warping (CPW) has been successfully employed to produce the state-of-the-art video stabilization results in many challenging cases.

Frame Novel View Synthesis +1
