no code implementations • 25 Jun 2024 • Weitong Cai, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu
This intrinsic modality imbalance leaves a considerable portion of visual information unaligned with text.
no code implementations • 24 Jan 2024 • Dezhao Luo, Shaogang Gong, Jiabo Huang, Hailin Jin, Yang Liu
We address two problems in video editing for optimising unseen-domain VMR: (1) generation of high-quality simulation videos of different moments with subtle distinctions, and (2) selection of simulation videos that complement existing source training videos without introducing harmful noise or unnecessary repetitions.
1 code implementation • ICCV 2023 • Ting Lei, Fabian Caba, Qingchao Chen, Hailin Jin, Yuxin Peng, Yang Liu
This observation motivates us to design an HOI detector that can be trained even with long-tailed labeled data and can leverage existing knowledge from pre-trained models.
no code implementations • CVPR 2023 • Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu
The correlation between vision and text is essential for video moment retrieval (VMR); however, existing methods rely heavily on separate pre-trained feature extractors for visual and textual understanding.
1 code implementation • ICCV 2023 • Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui
To study this problem, we propose the first dataset of untrimmed, long-form tutorial videos for the task of Moment Detection, called the Behance Moment Detection (BMD) dataset.
no code implementations • 12 Oct 2022 • JieLin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin
Livestream videos have become a significant part of online learning, with experienced experts teaching design, digital marketing, creative painting, and other skills in these sessions, making them valuable learning materials.
no code implementations • 10 Oct 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding.
no code implementations • 26 Jun 2022 • Jiabo Huang, Hailin Jin, Shaogang Gong, Yang Liu
Such uncertainties in temporal labelling are currently ignored in model training, resulting in the learning of mismatched video-text correlations that generalise poorly at test time.
no code implementations • 7 Apr 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
Multimedia summarization with multimodal output can play an essential role in real-world applications, e.g., automatically generating cover images and titles for news articles or providing introductions to online videos.
no code implementations • 10 Mar 2022 • Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse
We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.
1 code implementation • CVPR 2022 • Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie
In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries.
Ranked #5 on Video Retrieval on QuerYD
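As a rough illustration of the hubness effect described above, the NumPy sketch below counts how often each gallery embedding lands in queries' top-10 lists, then applies a simple inverted-softmax correction; the probe bank, temperature, and sizes are illustrative assumptions, not the paper's exact normalisation scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy joint-embedding retrieval: unit-normalised query and gallery vectors.
queries = rng.normal(size=(500, 64))
gallery = rng.normal(size=(200, 64))
queries /= np.linalg.norm(queries, axis=1, keepdims=True)
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

sim = queries @ gallery.T                       # (n_queries, n_gallery)

# Hubness: how often does each gallery item appear in a query's top-10?
top10 = np.argsort(-sim, axis=1)[:, :10]
occurrence = np.bincount(top10.ravel(), minlength=len(gallery))
print("most frequent hub appears in", occurrence.max(), "top-10 lists")

# Simple correction: rescale each gallery item's similarities by a softmax
# over a bank of probe queries, penalising items that are close to everything.
beta = 20.0
probe = np.exp(beta * sim)                      # re-using queries as the probe bank
normalised = probe / probe.sum(axis=0, keepdims=True)
top10_norm = np.argsort(-normalised, axis=1)[:, :10]
occurrence_norm = np.bincount(top10_norm.ravel(), minlength=len(gallery))
print("after normalisation:", occurrence_norm.max())
```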
no code implementations • ICCV 2021 • Simon Jenni, Hailin Jin
We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos.
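For context, the contrastive principle behind such methods can be sketched with a generic InfoNCE objective in PyTorch; the batch construction and temperature below are assumptions, and the paper's specific video pretext transformations are not reproduced.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Generic InfoNCE loss between two batches of clip embeddings.

    z1[i] and z2[i] are assumed to be two augmented views of the same video;
    every other pairing in the batch acts as a negative.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random stand-ins for clip embeddings.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce(z1, z2).item())
```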
no code implementations • NeurIPS 2021 • Reuben Tan, Bryan Plummer, Kate Saenko, Hailin Jin, Bryan Russell
Key to our approach is the ability to learn to spatially localize interactions with self-supervision on a large corpus of videos with accompanying transcribed narrations.
1 code implementation • EMNLP 2021 • Sangwoo Cho, Franck Dernoncourt, Tim Ganter, Trung Bui, Nedim Lipka, Walter Chang, Hailin Jin, Jonathan Brandt, Hassan Foroosh, Fei Liu
With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge.
1 code implementation • 30 Aug 2021 • Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin
The novel graph constructor maps a glyph's latent code to a graph representation that matches expert knowledge, and it is trained to support the translation task.
no code implementations • ICCV 2021 • Jiabo Huang, Yang Liu, Shaogang Gong, Hailin Jin
Video activity localisation has recently attracted increasing attention due to its practical value in automatically localising the most salient visual segments corresponding to language descriptions (sentences) in untrimmed and unstructured videos.
1 code implementation • 23 Jul 2021 • Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin
Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters.
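In this spirit, one simple large-scale, data-free diversity probe is the mean pairwise perceptual distance between generated samples; the sketch below uses the `lpips` package and a hypothetical `fake_generator`, and is an assumption for illustration rather than the paper's proposed test.

```python
import itertools
import torch
import lpips  # pip install lpips

# Hypothetical generator; any model mapping latents to (N, 3, H, W)
# images in [-1, 1] would do here.
def fake_generator(z):
    return torch.tanh(z.view(-1, 3, 64, 64))

loss_fn = lpips.LPIPS(net='alex')
samples = fake_generator(torch.randn(8, 3 * 64 * 64))

# Mean pairwise perceptual distance as a crude diversity score: higher
# means more diverse samples, and it needs neither the training data
# nor the generator's parameters, only its samples.
dists = [loss_fn(samples[i:i + 1], samples[j:j + 1]).item()
         for i, j in itertools.combinations(range(len(samples)), 2)]
print("mean pairwise LPIPS:", sum(dists) / len(dists))
```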
1 code implementation • 15 Jun 2021 • Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse
We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.
no code implementations • CVPR 2021 • Dipu Manandhar, Hailin Jin, John Collomosse
We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts.
1 code implementation • NeurIPS 2021 • Pradyumna Reddy, Zhifei Zhang, Matthew Fisher, Hailin Jin, Zhaowen Wang, Niloy J. Mitra
Fonts are ubiquitous across documents and come in a variety of styles.
1 code implementation • ICCV 2021 • Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Samuel Albanie, Yang Liu
In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders.
no code implementations • ICCV 2021 • Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse
We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style.
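For reference, the AdaIN operation the name alludes to (Huang & Belongie, 2017) re-normalises content feature statistics to match a style's; this minimal PyTorch sketch shows that single operation only, not ALADIN's all-layer embedding for style search.

```python
import torch

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive Instance Normalisation: shift/scale the channel-wise
    statistics of content features to match those of style features."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True)
    return s_std * (content_feat - c_mean) / c_std + s_mean

content = torch.randn(1, 256, 32, 32)   # e.g. VGG feature maps
style = torch.randn(1, 256, 32, 32)
print(adain(content, style).shape)      # torch.Size([1, 256, 32, 32])
```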
no code implementations • 2 Aug 2020 • Wentian Zhao, Seokhwan Kim, Ning Xu, Hailin Jin
This paper presents a new video question answering task on screencast tutorials.
1 code implementation • NeurIPS 2020 • Tong He, John Collomosse, Hailin Jin, Stefano Soatto
We propose Geo-PIFu, a method to recover a 3D mesh from a monocular color image of a clothed person.
no code implementations • CVPR 2020 • Simon Jenni, Hailin Jin, Paolo Favaro
Based on this criterion, we introduce a novel image transformation that we call limited context inpainting (LCI).
1 code implementation • CVPR 2020 • Fengting Yang, Qian Sun, Hailin Jin, Zihan Zhou
In computer vision, superpixels have been widely used as an effective way to reduce the number of image primitives for subsequent processing.
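To illustrate the primitive-reduction idea, the sketch below runs classical SLIC from scikit-image and summarises each superpixel by its mean colour; this hand-crafted baseline is for context only and is not the paper's learned superpixel method.

```python
import numpy as np
from skimage.data import astronaut
from skimage.segmentation import slic

image = astronaut()                       # (512, 512, 3) built-in test image
segments = slic(image, n_segments=200, compactness=10, start_label=0)

# Replace each pixel with its superpixel's mean colour: ~260k pixels are
# now summarised by ~200 primitives for downstream processing.
out = np.zeros_like(image, dtype=float)
for label in np.unique(segments):
    mask = segments == label
    out[mask] = image[mask].mean(axis=0)
print("number of primitives:", len(np.unique(segments)))
```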
2 code implementations • 14 Jan 2020 • Kary Ho, Andrew Gilbert, Hailin Jin, John Collomosse
We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP).
no code implementations • 25 Sep 2019 • Zhenyu Wu, Ye Yuan, Zhaowen Wang, Jianming Zhang, Zhangyang Wang, Hailin Jin
Generative adversarial networks (GANs) nowadays are capable of producing images of incredible realism.
1 code implementation • ICCV 2019 • Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin
We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.
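A minimal single-image Deep Image Prior loop conveys the underlying mechanism: fit a randomly initialised ConvNet from fixed noise to the observed pixels only, and let the network's structure hallucinate the masked region. The network, mask, and step count below are toy assumptions; the paper's joint appearance-and-flow video formulation is not reproduced.

```python
import torch
import torch.nn as nn

# Toy DIP network: fixed noise in, image out.
net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)

image = torch.rand(1, 3, 64, 64)                  # stand-in for a damaged frame
mask = (torch.rand(1, 1, 64, 64) > 0.3).float()   # 1 = observed pixel
noise = torch.randn(1, 32, 64, 64)                # fixed network input

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(noise) - image) ** 2 * mask).mean()  # fit observed pixels only
    loss.backward()
    opt.step()

inpainted = net(noise)   # the prior of the architecture fills the holes
```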
no code implementations • ICCV 2019 • Tianlang Chen, Zhaowen Wang, Ning Xu, Hailin Jin, Jiebo Luo
In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively.
5 code implementations • 12 Jun 2019 • Zhen-Yu Wu, Haotao Wang, Zhaowen Wang, Hailin Jin, Zhangyang Wang
We first discuss an innovative heuristic of cross-dataset training and evaluation, enabling the use of multiple single-task datasets (one with target task labels and the other with privacy labels) in our problem.
2 code implementations • CVPR 2019 • Xingyu Liu, Joon-Young Lee, Hailin Jin
In particular, it can effectively learn representations for videos by mixing appearance and long-range motion with an RGB-only input.
no code implementations • 18 Apr 2019 • Longqi Yang, Chen Fang, Hailin Jin, Walter Chang, Deborah Estrin
Complex design tasks often require performing diverse actions in a specific order.
no code implementations • CVPR 2019 • John Collomosse, Tu Bui, Hailin Jin
LiveSketch is a novel algorithm for searching large image collections using hand-sketched queries.
no code implementations • 19 Nov 2018 • Shuhui Jiang, Zhaowen Wang, Aaron Hertzmann, Hailin Jin, Yun Fu
Third, font pairing is an asymmetric problem in that the roles played by header and body fonts are not interchangeable.
no code implementations • ECCV 2018 • Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu
Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.
no code implementations • ECCV 2018 • Fabian Caba Heilbron, Joon-Young Lee, Hailin Jin, Bernard Ghanem
In this paper, we introduce a novel active learning framework for temporal localization that aims to mitigate this data dependency issue.
no code implementations • ECCV 2018 • Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell
We propose to leverage the parameters that lead to the output images to improve image feature learning.
no code implementations • ECCV 2018 • Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo
It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.
3 code implementations • ECCV 2018 • Zhen-Yu Wu, Zhangyang Wang, Zhaowen Wang, Hailin Jin
This paper aims to improve privacy-preserving visual recognition, an increasingly demanded feature in smart camera applications, by formulating a unique adversarial training framework.
no code implementations • CVPR 2018 • Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell
The encoder and the discriminators are trained cooperatively on factors of interest, but in an adversarial way on factors of distraction.
no code implementations • CVPR 2018 • Andrew Gilbert, John Collomosse, Hailin Jin, Brian Price
Content-aware image completion or in-painting is a fundamental tool for the correction of defects or removal of objects in images.
1 code implementation • 25 May 2018 • Zheng Xu, Michael Wilber, Chen Fang, Aaron Hertzmann, Hailin Jin
We propose a fast feed-forward network for arbitrary style transfer, which can generate stylized images for previously unseen content and style image pairs.
no code implementations • 30 Jan 2018 • Quanzeng You, Hailin Jin, Jiebo Luo
In this work, we propose two different models, which employ different schemes for injecting sentiments into image captions.
no code implementations • ICLR 2018 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa
Context information plays an important role in human language understanding, and it is also useful for machines to learn vector representations of language.
no code implementations • WS 2018 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa
We carefully designed experiments to show that neither an autoregressive decoder nor an RNN decoder is required.
no code implementations • ICCV 2017 • John Collomosse, Tu Bui, Michael J. Wilber, Chen Fang, Hailin Jin
We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints.
no code implementations • CVPR 2017 • Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu
We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.
no code implementations • WS 2017 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa
We train our skip-thought neighbor model on a large corpus with continuous sentences, and then evaluate the trained model on 7 tasks, which include semantic relatedness, paraphrase detection, and classification benchmarks.
no code implementations • 9 Jun 2017 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa
The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics.
no code implementations • ICCV 2017 • Michael J. Wilber, Chen Fang, Hailin Jin, Aaron Hertzmann, John Collomosse, Serge Belongie
Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation.
no code implementations • CVPR 2017 • Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, Thomas Funkhouser
One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for core scene understanding tasks such as semantic segmentation, normal prediction, and object edge detection.
2 code implementations • CVPR 2016 • Long Mai, Hailin Jin, Feng Liu
Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment.
Ranked #6 on Aesthetics Quality Assessment on AVA
2 code implementations • 9 May 2016 • Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang
We hope that this data set encourages further research on visual emotion analysis.
no code implementations • CVPR 2016 • Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo
Automatically generating a natural language description of an image has recently attracted interest, both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing.
no code implementations • 22 Dec 2015 • Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille
Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.
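A toy sketch of the idea, with random vectors standing in for learned embeddings: once images and class labels share one space, an unseen class can be recognised through its label embedding alone.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Random stand-ins for learned label embeddings in the shared space.
label_emb = {name: rng.normal(size=dim) for name in ["cat", "dog", "zebra"]}
for v in label_emb.values():
    v /= np.linalg.norm(v)

def classify(image_emb):
    """Nearest label embedding by cosine similarity."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    scores = {name: float(image_emb @ v) for name, v in label_emb.items()}
    return max(scores, key=scores.get)

# "zebra" may never appear among training images, yet it is reachable
# as long as its label has an embedding in the shared space.
print(classify(rng.normal(size=dim)))
```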
no code implementations • 20 Sep 2015 • Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang
Sentiment analysis of such large-scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that sentiment prediction from visual content complements textual sentiment analysis.
1 code implementation • 12 Jul 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
As font is one of the core design concepts, automatic font identification and similar-font suggestion from an image or photo have been on the wish list of many designers.
Ranked #1 on Font Recognition on VFR-Wild
no code implementations • CVPR 2015 • Jonathan Krause, Hailin Jin, Jianchao Yang, Li Fei-Fei
Scaling up fine-grained recognition to all domains of fine-grained objects is a challenge the computer vision community will need to face in order to realize its goal of recognizing all object categories.
no code implementations • 31 Mar 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
We address a challenging fine-grain classification problem: recognizing a font style from an image of text.
no code implementations • CVPR 2015 • Chen Fang, Hailin Jin, Jianchao Yang, Zhe Lin
We validate our feature learning paradigm on this dataset and find that the learned features significantly outperform state-of-the-art image features in learning image similarities.
no code implementations • 18 Dec 2014 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
We present a domain adaption framework to address a domain mismatch between synthetic training and real-world testing data.
no code implementations • CVPR 2014 • Linchao Bao, Qingxiong Yang, Hailin Jin
We present a fast optical flow algorithm that can handle large displacement motions.
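For context only, the snippet below runs OpenCV's standard Farneback dense-flow baseline on a synthetic large horizontal shift; the parameter values are illustrative and this is not the paper's algorithm.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
prev = cv2.GaussianBlur(
    rng.integers(0, 255, (240, 320)).astype(np.uint8), (0, 0), 5)
curr = np.roll(prev, 15, axis=1)   # simulate a 15-pixel horizontal motion

# Positional args: flow, pyr_scale, levels, winsize, iterations,
# poly_n, poly_sigma, flags. A deeper pyramid helps with large shifts.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 5, 21, 3, 5, 1.1, 0)
print("median horizontal flow:", np.median(flow[..., 0]))
```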
no code implementations • CVPR 2014 • Guang Chen, Jianchao Yang, Hailin Jin, Jonathan Brandt, Eli Shechtman, Aseem Agarwala, Tony X. Han
This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content.
Ranked #1 on Font Recognition on VFR-447
no code implementations • 21 Dec 2013 • Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang
The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.
no code implementations • CVPR 2013 • Hyeongwoo Kim, Hailin Jin, Sunil Hadap, In-So Kweon
Our method is based on a novel observation that for most natural images the dark channel can provide an approximate specular-free image.
no code implementations • CVPR 2013 • Zihan Zhou, Hailin Jin, Yi Ma
Recently, a new image deformation technique called content-preserving warping (CPW) has been successfully employed to produce the state-of-the-art video stabilization results in many challenging cases.
no code implementations • CVPR 2013 • Zhuoyuan Chen, Hailin Jin, Zhe Lin, Scott Cohen, Ying Wu
We use approximate nearest neighbor fields to compute an initial motion field and use a robust algorithm to compute a set of similarity transformations as the motion candidates for segmentation.