Search Results for author: Jinhui Tang

Found 44 papers, 14 papers with code

Video-Text Pre-training with Learned Regions

no code implementations 2 Dec 2021 Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang

Video-Text pre-training aims at learning transferable representations from large-scale video-text pairs via aligning the semantics between visual and textual information.

Frame Representation Learning +1

Fine-Grained Image Analysis with Deep Learning: A Survey

no code implementations 11 Nov 2021 Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie

Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.

Fine-Grained Image Recognition Image Retrieval

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

no code implementations CVPR 2021 Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo

Under the guidance of the geometrical relationship between OCR tokens, our LSTM-R capitalizes on a newly-devised relation-aware pointer network to select OCR tokens from the scene text for OCR-based image captioning.

Image Captioning Optical Character Recognition

CTNet: Context-based Tandem Network for Semantic Segmentation

1 code implementation 20 Apr 2021 Zechao Li, Yanpeng Sun, Jinhui Tang

Specifically, the Spatial Contextual Module (SCM) is leveraged to uncover the spatial contextual dependency between pixels by exploring the correlation between pixels and categories.

Semantic Segmentation

Interactive Fusion of Multi-level Features for Compositional Activity Recognition

1 code implementation 10 Dec 2020 Rui Yan, Lingxi Xie, Xiangbo Shu, Jinhui Tang

To understand a complex action, multiple sources of information, including appearance, positional, and semantic features, need to be integrated.

Action Recognition

Feature Pyramid Transformer

1 code implementation ECCV 2020 Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, Qianru Sun

Yet, the non-local spatial interactions are not across scales, and thus they fail to capture the non-local contexts of objects (or parts) residing in different scales.

Instance Segmentation Object Detection +1

Social Adaptive Module for Weakly-supervised Group Activity Recognition

no code implementations ECCV 2020 Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian

This paper presents a new task named weakly-supervised group activity recognition (GAR) which differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.

Frame Group Activity Recognition

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

6 code implementations NeurIPS 2020 Xiang Li, Wenhai Wang, Lijun Wu, Shuo Chen, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang

Specifically, we merge the quality estimation into the class prediction vector to form a joint representation of localization quality and classification, and use a vector to represent arbitrary distribution of box locations.

Classification Dense Object Detection +1

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior

1 code implementation CVPR 2020 Jinshan Pan, Haoran Bai, Jinhui Tang

The proposed algorithm mainly consists of optical flow estimation from intermediate latent frames and latent frame restoration steps.

Ranked #3 on Deblurring on DVD (using extra training data)

Deblurring Frame +1

Deep Blind Video Super-resolution

2 code implementations ICCV 2021 Jinshan Pan, Songsheng Cheng, Jiawei Zhang, Jinhui Tang

Existing video super-resolution (SR) algorithms usually assume that the blur kernels in the degradation process are known and do not model the blur kernels in the restoration.

Image Deconvolution Image Restoration +2

Adaptive Context Network for Scene Parsing

no code implementations ICCV 2019 Jun Fu, Jing Liu, Yuhang Wang, Yong Li, Yongjun Bao, Jinhui Tang, Hanqing Lu

Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally.

Scene Parsing Semantic Segmentation

Spatiotemporal Co-attention Recurrent Neural Networks for Human-Skeleton Motion Prediction

no code implementations 29 Sep 2019 Xiangbo Shu, Liyan Zhang, Guo-Jun Qi, Wei Liu, Jinhui Tang

To this end, we propose a novel Skeleton-joint Co-attention Recurrent Neural Networks (SC-RNN) to capture the spatial coherence among joints, and the temporal evolution among skeletons simultaneously on a skeleton-joint co-attention feature map in spatiotemporal space.

Human motion prediction motion prediction

Image Formation Model Guided Deep Image Super-Resolution

1 code implementation 18 Aug 2019 Jinshan Pan, Yang Liu, Deqing Sun, Jimmy Ren, Ming-Ming Cheng, Jian Yang, Jinhui Tang

We present a simple and effective image super-resolution algorithm that imposes an image formation constraint on the deep neural networks via pixel substitution.

Image Super-Resolution

Aligning Linguistic Words and Visual Semantic Units for Image Captioning

1 code implementation 6 Aug 2019 Longteng Guo, Jing Liu, Jinhui Tang, Jiangwei Li, Wei Luo, Hanqing Lu

Image captioning attempts to generate a sentence composed of several linguistic words, which are used to describe objects, attributes, and interactions in an image, denoted as visual semantic units in this paper.

Image Captioning

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation

no code implementations 1 Aug 2019 Jing Wang, Yingwei Pan, Ting Yao, Jinhui Tang, Tao Mei

A valid question is how to encapsulate such gists/topics that are worthy of mention from an image, and then describe the image from one topic to another but holistically with a coherent structure.

Image Paragraph Captioning

Modeling Embedding Dimension Correlations via Convolutional Neural Collaborative Filtering

1 code implementation 26 Jun 2019 Xiaoyu Du, Xiangnan He, Fajie Yuan, Jinhui Tang, Zhiguang Qin, Tat-Seng Chua

In this work, we emphasize modeling the correlations among embedding dimensions in neural networks to pursue higher effectiveness for CF.

Collaborative Filtering Recommendation Systems

Joint Label Prediction based Semi-Supervised Adaptive Concept Factorization for Robust Data Representation

no code implementations 25 May 2019 Zhao Zhang, Yan Zhang, Guangcan Liu, Jinhui Tang, Shuicheng Yan, Meng Wang

To enrich prior knowledge to enhance the discrimination, RS2ACF clearly uses class information of labeled data and more importantly propagates it to unlabeled data by jointly learning an explicit label indicator for unlabeled data.

Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals

no code implementations 9 Jan 2019 Lu Jin, Zechao Li, Jinhui Tang

In this article, we propose a novel deep semantic multimodal hashing network (DSMHN) for scalable image-text and video-text retrieval.

Cross-Modal Retrieval Representation Learning +1

Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution

no code implementations NeurIPS 2018 Longquan Dai, Liang Tang, Yuan Xie, Jinhui Tang

Over the decades, people took a handmade approach to design fast algorithms for the Gaussian convolution.

Fast Matrix Factorization with Non-Uniform Weights on Missing Data

1 code implementation 11 Nov 2018 Xiangnan He, Jinhui Tang, Xiaoyu Du, Richang Hong, Tongwei Ren, Tat-Seng Chua

This poses an imbalanced learning problem, since the scale of missing entries is usually much larger than that of observed entries, but they cannot be ignored due to the valuable negative signal.

Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition

no code implementations 1 Nov 2018 Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Wei Liu, Jian Yang

In a Co-LSTM unit, each sub-memory unit stores individual motion information, while this Co-LSTM unit selectively integrates and stores inter-related motion information between multiple interacting persons from multiple sub-memory units via the cell gate and co-memory cell, respectively.

Action Recognition Human Interaction Recognition

Adversarial Training Towards Robust Multimedia Recommender System

1 code implementation 19 Sep 2018 Jinhui Tang, Xiaoyu Du, Xiangnan He, Fajie Yuan, Qi Tian, Tat-Seng Chua

To this end, we propose a novel solution named Adversarial Multimedia Recommendation (AMR), which can lead to a more robust multimedia recommender model by using adversarial learning.

Information Retrieval Multimedia

Outer Product-based Neural Collaborative Filtering

1 code implementation 12 Aug 2018 Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, Tat-Seng Chua

In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering.

Collaborative Filtering

Physics-Based Generative Adversarial Models for Image Restoration and Beyond

no code implementations 2 Aug 2018 Jinshan Pan, Jiangxin Dong, Yang Liu, Jiawei Zhang, Jimmy Ren, Jinhui Tang, Yu-Wing Tai, Ming-Hsuan Yang

We present an algorithm to directly solve numerous image restoration problems (e.g., image deblurring, image dehazing, image deraining, etc.).

Deblurring Image Deblurring +3

Single Image Dehazing via Conditional Generative Adversarial Network

no code implementations CVPR 2018 Runde Li, Jinshan Pan, Zechao Li, Jinhui Tang

In contrast, we solve this problem based on a conditional generative adversarial network (cGAN), where the clear image is estimated by an end-to-end trainable neural network.

Image Dehazing Single Image Dehazing

Deep Ordinal Hashing with Spatial Attention

no code implementations 7 May 2018 Lu Jin, Xiangbo Shu, Kai Li, Zechao Li, Guo-Jun Qi, Jinhui Tang

However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images.

Image Retrieval

Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging

no code implementations 12 Apr 2018 Jinhui Tang, Xiangbo Shu, Zechao Li, Yu-Gang Jiang, Qi Tian

Recent approaches simultaneously explore visual, user and tag information to improve the performance of image retagging by constructing and exploring an image-tag-user graph.

Graph Learning TAG

Hardware-Efficient Guided Image Filtering For Multi-Label Problem

no code implementations CVPR 2017 Longquan Dai, Mengke Yuan, Zechao Li, Xiaopeng Zhang, Jinhui Tang

In this paper we propose a hardware-efficient Guided Filter (HGF), which solves the efficiency problem of multichannel guided image filtering and yields competent results when applying it to multi-label problems with synthesized polynomial multichannel guidance.

Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification

no code implementations 14 Jun 2017 Yu-Gang Jiang, Zuxuan Wu, Jinhui Tang, Zechao Li, Xiangyang Xue, Shih-Fu Chang

More specifically, we utilize three Convolutional Neural Networks (CNNs) operating on appearance, motion and audio signals to extract their corresponding features.

General Classification Video Classification

Personalized Age Progression with Bi-level Aging Dictionary Learning

no code implementations 4 Jun 2017 Xiangbo Shu, Jinhui Tang, Zechao Li, Hanjiang Lai, Liyan Zhang, Shuicheng Yan

Basically, for each age group, we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles), where the dictionary bases corresponding to the same index yet from two neighboring aging dictionaries form a particular aging pattern across these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process.

Dictionary Learning Face Verification

Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition

no code implementations 3 Jun 2017 Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Yan Song, Zechao Li, Liyan Zhang

To this end, we propose a novel Concurrence-Aware Long Short-Term Sub-Memories (Co-LSTSM) to model the long-term inter-related dynamics between two interacting people on the bounding boxes covering people.

Action Recognition Frame

Deep Learning Driven Visual Path Prediction from a Single Image

no code implementations 27 Jan 2016 Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang

The highly effective visual representation and deep context models ensure that our framework makes a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.

Human Parsing With Contextualized Convolutional Neural Network

no code implementations ICCV 2015 Xiaodan Liang, Chunyan Xu, Xiaohui Shen, Jianchao Yang, Si Liu, Jinhui Tang, Liang Lin, Shuicheng Yan

In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which well integrates the cross-layer context, global image-level context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network.

Human Parsing

Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm

no code implementations 23 Oct 2015 Canyi Lu, Jinhui Tang, Shuicheng Yan, Zhouchen Lin

The nuclear norm is widely used as a convex surrogate of the rank function in compressive sensing for low rank matrix recovery with its applications in image recovery and signal processing.

Compressive Sensing

Personalized Age Progression with Aging Dictionary

no code implementations ICCV 2015 Xiangbo Shu, Jinhui Tang, Hanjiang Lai, Luoqi Liu, Shuicheng Yan

Second, it is challenging or even impossible to collect faces of all age groups for a particular subject, yet much easier and more practical to get face pairs from neighboring age groups.

Dictionary Learning Face Verification

Sparse Composite Quantization

no code implementations CVPR 2015 Ting Zhang, Guo-Jun Qi, Jinhui Tang, Jingdong Wang

The benefit is that the distance evaluation between the query and the dictionary element (a sparse vector) is accelerated using the efficient sparse vector operation, and thus the cost of distance table computation is reduced a lot.


Correntropy Induced L2 Graph for Robust Subspace Clustering

no code implementations 18 Jan 2015 Canyi Lu, Jinhui Tang, Min Lin, Liang Lin, Shuicheng Yan, Zhouchen Lin

In this paper, we study the robust subspace clustering problem, which aims to cluster the given possibly noisy data points into their underlying subspaces.

graph construction

Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features

no code implementations 3 Sep 2014 Liansheng Zhuang, Shenghua Gao, Jinhui Tang, Jingjing Wang, Zhouchen Lin, Yi Ma

This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting.

graph construction

Generalized Nonconvex Nonsmooth Low-Rank Minimization

no code implementations CVPR 2014 Canyi Lu, Jinhui Tang, Shuicheng Yan, Zhouchen Lin

We observe that all the existing nonconvex penalty functions are concave and monotonically increasing on $[0,\infty)$.

Weakly-Supervised Dual Clustering for Image Semantic Segmentation

no code implementations CVPR 2013 Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions.

Semantic Segmentation Superpixels +1
