Search Results for author: Hailin Jin

Found 69 papers, 19 papers with code

Superpixel Segmentation with Fully Convolutional Networks

1 code implementation • CVPR 2020 • Fengting Yang, Qian Sun, Hailin Jin, Zihan Zhou

In computer vision, superpixels have been widely used as an effective way to reduce the number of image primitives for subsequent processing.

Disparity Estimation Segmentation +2

372

Paper
Code

TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval

1 code implementation • ICCV 2021 • Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Samuel Albanie, Yang Liu

In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders.

Retrieval Video Retrieval

327

Paper
Code

DeepFont: Identify Your Font from An Image

1 code implementation • 12 Jul 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang

As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers.

Ranked #1 on Font Recognition on VFR-Wild

Domain Adaptation Font Recognition +1

189

Paper
Code

Learning Video Representations from Correspondence Proposals

2 code implementations • CVPR 2019 • Xingyu Liu, Joon-Young Lee, Hailin Jin

In particular, it can effectively learn representations for videos by mixing appearance and long-range motion with an RGB-only input.

Ranked #1 on Action Recognition In Videos on Jester (Gesture Recognition)

Action Recognition In Videos

146

Paper
Code

Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction

1 code implementation • NeurIPS 2020 • Tong He, John Collomosse, Hailin Jin, Stefano Soatto

We propose Geo-PIFu, a method to recover a 3D mesh from a monocular color image of a clothed person.

109

Paper
Code

An Internal Learning Approach to Video Inpainting

1 code implementation • ICCV 2019 • Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin

We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.

Optical Flow Estimation Video Inpainting

Paper
Code

Cross Modal Retrieval with Querybank Normalisation

1 code implementation • CVPR 2022 • Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie

In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries.

Ranked #5 on Video Retrieval on QuerYD

Cross-Modal Retrieval Metric Learning +3

Paper
Code

Privacy-Preserving Deep Action Recognition: An Adversarial Learning Framework and A New Dataset

5 code implementations • 12 Jun 2019 • Zhen-Yu Wu, Haotao Wang, Zhaowen Wang, Hailin Jin, Zhangyang Wang

We first discuss an innovative heuristic of cross-dataset training and evaluation, enabling the use of multiple single-task datasets (one with target task labels and the other with privacy labels) in our problem.

Action Recognition Privacy Preserving +1

Paper
Code

Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study

3 code implementations • ECCV 2018 • Zhen-Yu Wu, Zhangyang Wang, Zhaowen Wang, Hailin Jin

This paper aims to improve privacy-preserving visual recognition, an increasingly demanded feature in smart camera applications, by formulating a unique adversarial training framework.

Action Recognition Privacy Preserving +1

Paper
Code

Learning from Multi-domain Artistic Images for Arbitrary Style Transfer

1 code implementation • 25 May 2018 • Zheng Xu, Michael Wilber, Chen Fang, Aaron Hertzmann, Hailin Jin

We propose a fast feed-forward network for arbitrary style transfer, which can generate stylized image for previously unseen content and style image pairs.

Style Transfer

Paper
Code

A Multi-Implicit Neural Representation for Fonts

1 code implementation • NeurIPS 2021 • Pradyumna Reddy, Zhifei Zhang, Matthew Fisher, Hailin Jin, Zhaowen Wang, Niloy J. Mitra

Fonts are ubiquitous across documents and come in a variety of styles.

Paper
Code

Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory

1 code implementation • ICCV 2023 • Ting Lei, Fabian Caba, Qingchao Chen, Hailin Jin, Yuxin Peng, Yang Liu

This observation motivates us to design an HOI detector that can be trained even with long-tailed labeled data and can leverage existing knowledge from pre-trained models.

Human-Object Interaction Detection Retrieval

Paper
Code

Moment Detection in Long Tutorial Videos

1 code implementation • ICCV 2023 • Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui

To study this problem, we propose the first dataset of untrimmed, long-form tutorial videos for the task of Moment Detection called the Behance Moment Detection (BMD) dataset.

Paper
Code

Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark

2 code implementations • 9 May 2016 • Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang

We hope that this data set encourages further research on visual emotion analysis.

Benchmarking Emotion Recognition

Paper
Code

Compositional Sketch Search

1 code implementation • 15 Jun 2021 • Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.

Position Quantization +2

Paper
Code

Neural Architecture Search for Deep Image Prior

2 code implementations • 14 Jan 2020 • Kary Ho, Andrew Gilbert, Hailin Jin, John Collomosse

We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP).

Image Restoration Neural Architecture Search +1

Paper
Code

Font Completion and Manipulation by Cycling Between Multi-Modality Representations

1 code implementation • 30 Aug 2021 • Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin

The novel graph constructor maps a glyph's latent code to its graph representation that matches expert knowledge, which is trained to help the translation task.

Image-to-Image Translation Representation Learning +2

Paper
Code

StreamHover: Livestream Transcript Summarization and Annotation

1 code implementation • EMNLP 2021 • Sangwoo Cho, Franck Dernoncourt, Tim Ganter, Trung Bui, Nedim Lipka, Walter Chang, Hailin Jin, Jonathan Brandt, Hassan Foroosh, Fei Liu

With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge.

Extractive Summarization

Paper
Code

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

1 code implementation • 23 Jul 2021 • Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin

Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depends on the access to original training data as well as the trained model parameters.

Image Generation

Paper
Code

Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding

no code implementations • WS 2018 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

We carefully designed experiments to show that neither an autoregressive decoder nor an RNN decoder is required.

Representation Learning Sentence

Paper
Add Code

Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions

no code implementations • 30 Jan 2018 • Quanzeng You, Hailin Jin, Jiebo Luo

In this work, we propose two different models, which employ different schemes for injecting sentiments into image captions.

Image Captioning Natural Language Understanding

Paper
Add Code

BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography

no code implementations • ICCV 2017 • Michael J. Wilber, Chen Fang, Hailin Jin, Aaron Hertzmann, John Collomosse, Serge Belongie

Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation.

Attribute Domain Adaptation

Paper
Add Code

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

no code implementations • CVPR 2017 • Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, Thomas Funkhouser

One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for core scene understanding tasks such as semantic segmentation, normal prediction, and object edge detection.

Boundary Detection Edge Detection +4

Paper
Add Code

Trimming and Improving Skip-thought Vectors

no code implementations • 9 Jun 2017 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics.

Sentence text-classification +1

Paper
Add Code

Rethinking Skip-thought: A Neighborhood based Approach

no code implementations • WS 2017 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

We train our skip-thought neighbor model on a large corpus with continuous sentences, and then evaluate the trained model on 7 tasks, which include semantic relatedness, paraphrase detection, and classification benchmarks.

General Classification

Paper
Add Code

Image Captioning with Semantic Attention

no code implementations • CVPR 2016 • Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo

Automatically generating a natural language description of an image has attracted interests recently both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing.

Image Captioning

Paper
Add Code

Multi-Instance Visual-Semantic Embedding

no code implementations • 22 Dec 2015 • Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille

Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.

General Classification Image Classification +1

Paper
Add Code

Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks

no code implementations • 20 Sep 2015 • Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang

Sentiment analysis of such large scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis.

Sentiment Analysis

Paper
Add Code

Collaborative Feature Learning from Social Media

no code implementations • CVPR 2015 • Chen Fang, Hailin Jin, Jianchao Yang, Zhe Lin

We validate our feature learning paradigm on this dataset and find that the learned feature significantly outperforms the state-of-the-art image features in learning better image similarities.

Paper
Add Code

Decomposition-Based Domain Adaptation for Real-World Font Recognition

no code implementations • 18 Dec 2014 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang

We present a domain adaption framework to address a domain mismatch between synthetic training and real-world testing data.

Domain Adaptation Font Recognition +1

Paper
Add Code

Real-World Font Recognition Using Deep Network and Domain Adaptation

no code implementations • 31 Mar 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang

We address a challenging fine-grain classification problem: recognizing a font style from an image of text.

Domain Adaptation Font Recognition +1

Paper
Add Code

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training

no code implementations • 21 Dec 2013 • Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang

The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.

Paper
Add Code

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations • 10 Jul 2018 • Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Paper
Add Code

Visual Font Pairing

no code implementations • 19 Nov 2018 • Shuhui Jiang, Zhaowen Wang, Aaron Hertzmann, Hailin Jin, Yun Fu

Third, font pairing is an asymmetric problem in that the roles played by header and body fonts are not interchangeable.

Metric Learning

Paper
Add Code

Disentangling Structure and Aesthetics for Style-Aware Image Completion

no code implementations • CVPR 2018 • Andrew Gilbert, John Collomosse, Hailin Jin, Brian Price

Content-aware image completion or in-painting is a fundamental tool for the correction of defects or removal of objects in images.

Paper
Add Code

Multi-Task Adversarial Network for Disentangled Feature Learning

no code implementations • CVPR 2018 • Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell

The encoder and the discriminators are trained cooperatively on factors of interest, but in an adversarial way on factors of distraction.

Face Recognition Font Recognition +1

Paper
Add Code

What do I Annotate Next? An Empirical Study of Active Learning for Action Localization

no code implementations • ECCV 2018 • Fabian Caba Heilbron, Joon-Young Lee, Hailin Jin, Bernard Ghanem

In this paper, we introduce a novel active learning framework for temporal localization that aims to mitigate this data dependency issue.

Active Learning Temporal Action Localization +1

Paper
Add Code

Synthetically Supervised Feature Learning for Scene Text Recognition

no code implementations • ECCV 2018 • Yang Liu, Zhaowen Wang, Hailin Jin, Ian Wassell

We propose to leverage the parameters that lead to the output images to improve image feature learning.

Scene Text Recognition Synthetic Data Generation

Paper
Add Code

Interactive Boundary Prediction for Object Selection

no code implementations • ECCV 2018 • Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu

Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.

Image Segmentation Interactive Segmentation +3

Paper
Add Code

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

no code implementations • ECCV 2018 • Tianlang Chen, Zhongping Zhang, Quanzeng You, Chen Fang, Zhaowen Wang, Hailin Jin, Jiebo Luo

It uses two groups of matrices to capture the factual and stylized knowledge, respectively, and automatically learns the word-level weights of the two groups based on previous context.

Image Captioning

Paper
Add Code

Exploring Asymmetric Encoder-Decoder Structure for Context-based Sentence Representation Learning

no code implementations • ICLR 2018 • Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

Context information plays an important role in human language understanding, and it is also useful for machines to learn vector representations of language.

Representation Learning Sentence

Paper
Add Code

Plane-Based Content Preserving Warps for Video Stabilization

no code implementations • CVPR 2013 • Zihan Zhou, Hailin Jin, Yi Ma

Recently, a new image deformation technique called content-preserving warping (CPW) has been successfully employed to produce the state-of-the-art video stabilization results in many challenging cases.

Novel View Synthesis Video Stabilization

Paper
Add Code

Large Displacement Optical Flow from Nearest Neighbor Fields

no code implementations • CVPR 2013 • Zhuoyuan Chen, Hailin Jin, Zhe Lin, Scott Cohen, Ying Wu

We use approximate nearest neighbor fields to compute an initial motion field and use a robust algorithm to compute a set of similarity transformations as the motion candidates for segmentation.

Motion Estimation Motion Segmentation +2

Paper
Add Code

Specular Reflection Separation Using Dark Channel Prior

no code implementations • CVPR 2013 • Hyeongwoo Kim, Hailin Jin, Sunil Hadap, In-So Kweon

Our method is based on a novel observation that for most natural images the dark channel can provide an approximate specular-free image.

Paper
Add Code

Fast Edge-Preserving PatchMatch for Large Displacement Optical Flow

no code implementations • CVPR 2014 • Linchao Bao, Qingxiong Yang, Hailin Jin

We present a fast optical flow algorithm that can handle large displacement motions.

Optical Flow Estimation

Paper
Add Code

Large-Scale Visual Font Recognition

no code implementations • CVPR 2014 • Guang Chen, Jianchao Yang, Hailin Jin, Jonathan Brandt, Eli Shechtman, Aseem Agarwala, Tony X. Han

This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content.

Ranked #1 on Font Recognition on VFR-447

Font Recognition Image Categorization +1

Paper
Add Code

Fine-Grained Recognition Without Part Annotations

no code implementations • CVPR 2015 • Jonathan Krause, Hailin Jin, Jianchao Yang, Li Fei-Fei

Scaling up fine-grained recognition to all domains of fine-grained objects is a challenge the computer vision community will need to face in order to realize its goal of recognizing all object categories.

Paper
Add Code

Composition-Preserving Deep Photo Aesthetics Assessment

no code implementations • CVPR 2016 • Long Mai, Hailin Jin, Feng Liu

Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment.

Ranked #6 on Aesthetics Quality Assessment on AVA

Aesthetics Quality Assessment

Paper
Add Code

Spatial-Semantic Image Search by Visual Feature Synthesis

no code implementations • CVPR 2017 • Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu

We train a convolutional neural network to synthesize appropriate visual features that captures the spatial-semantic constraints from the user canvas query.

Image Retrieval Retrieval

Paper
Add Code

Sketching With Style: Visual Search With Sketches and Aesthetic Context

no code implementations • ICCV 2017 • John Collomosse, Tu Bui, Michael J. Wilber, Chen Fang, Hailin Jin

We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints.

Image Retrieval Retrieval

Paper
Add Code

LiveSketch: Query Perturbations for Guided Sketch-based Visual Search

no code implementations • CVPR 2019 • John Collomosse, Tu Bui, Hailin Jin

LiveSketch is a novel algorithm for searching large image collections using hand-sketched queries.

Clustering

Paper
Add Code

Creative Procedural-Knowledge Extraction From Web Design Tutorials

no code implementations • 18 Apr 2019 • Longqi Yang, Chen Fang, Hailin Jin, Walter Chang, Deborah Estrin

Complex design tasks often require performing diverse actions in a specific order.

text-classification Text Classification

Paper
Add Code

Large-scale Tag-based Font Retrieval with Generative Feature Learning

no code implementations • ICCV 2019 • Tianlang Chen, Zhaowen Wang, Ning Xu, Hailin Jin, Jiebo Luo

In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively.

Retrieval TAG

Paper
Add Code

Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

no code implementations • CVPR 2020 • Simon Jenni, Hailin Jin, Paolo Favaro

Based on this criterion, we introduce a novel image transformation that we call limited context inpainting (LCI).

Paper
Add Code

Video Question Answering on Screencast Tutorials

no code implementations • 2 Aug 2020 • Wentian Zhao, Seokhwan Kim, Ning Xu, Hailin Jin

This paper presents a new video question answering task on screencast tutorials.

Question Answering Video Question Answering

Paper
Add Code

ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity

no code implementations • ICCV 2021 • Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse

We present ALADIN (All Layer AdaIN); a novel architecture for searching images based on the similarity of their artistic style.

Representation Learning

Paper
Add Code

Magic Layouts: Structural Prior for Component Detection in User Interface Designs

no code implementations • CVPR 2021 • Dipu Manandhar, Hailin Jin, John Collomosse

We present Magic Layouts; a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts.

Paper
Add Code

Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation

no code implementations • ICCV 2021 • Jiabo Huang, Yang Liu, Shaogang Gong, Hailin Jin

Video activity localisation has recently attained increasing attention due to its practical values in automatically localising the most salient visual segments corresponding to their language descriptions (sentences) from untrimmed and unstructured videos.

Sentence

Paper
Add Code

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

no code implementations • 20 Oct 2021 • Reuben Tan, Bryan A. Plummer, Kate Saenko, Hailin Jin, Bryan Russell

Key to our approach is the ability to learn to spatially localize interactions with self-supervision on a large corpus of videos with accompanying transcribed narrations.

Paper
Add Code

Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

no code implementations • NeurIPS 2021 • Reuben Tan, Bryan Plummer, Kate Saenko, Hailin Jin, Bryan Russell

Key to our approach is the ability to learn to spatially localize interactions with self-supervision on a large corpus of videos with accompanying transcribed narrations.

Paper
Add Code

Is There Mode Collapse? A Case Study on Face Generation and Its Black-box Calibration

no code implementations • 25 Sep 2019 • Zhenyu Wu, Ye Yuan, Zhaowen Wang, Jianming Zhang, Zhangyang Wang, Hailin Jin

Generative adversarial networks (GANs) nowadays are capable of producing images of incredible realism.

Face Generation

Paper
Add Code

Time-Equivariant Contrastive Video Representation Learning

no code implementations • ICCV 2021 • Simon Jenni, Hailin Jin

We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos.

Action Recognition Contrastive Learning +3

Paper
Add Code

StyleBabel: Artistic Style Tagging and Captioning

no code implementations • 10 Mar 2022 • Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.

Attribute Representation Learning +2

Paper
Add Code

MHMS: Multimodal Hierarchical Multimedia Summarization

no code implementations • 7 Apr 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output can play an essential role in real-world applications, i.e., automatically generating cover images and titles for news articles or providing introductions to online videos.

Paper
Add Code

Video Activity Localisation with Uncertainties in Temporal Boundary

no code implementations • 26 Jun 2022 • Jiabo Huang, Hailin Jin, Shaogang Gong, Yang Liu

Such uncertainties in temporal labelling are currently ignored in model training, resulting in learning mis-matched video-text correlation with poor generalisation in test.

Paper
Add Code

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment

no code implementations • 10 Oct 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding.

Paper
Add Code

LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos

no code implementations • 12 Oct 2022 • JieLin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin

Livestream videos have become a significant part of online learning, where design, digital marketing, creative painting, and other skills are taught by experienced experts in the sessions, making them valuable materials.

Marketing Segmentation

Paper
Add Code

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training

no code implementations • CVPR 2023 • Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu

The correlation between the vision and text is essential for video moment retrieval (VMR), however, existing methods heavily rely on separate pre-training feature extractors for visual and textual understanding.

Moment Retrieval Retrieval

Paper
Add Code

Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval

no code implementations • 24 Jan 2024 • Dezhao Luo, Shaogang Gong, Jiabo Huang, Hailin Jin, Yang Liu

We address two problems in video editing for optimising unseen domain VMR: (1) generation of high-quality simulation videos of different moments with subtle distinctions, (2) selection of simulation videos that complement existing source training videos without introducing harmful noise or unnecessary repetitions.

Moment Retrieval Retrieval +2

Paper
Add Code
