Occlusion-Aware Siamese Network for Human Pose Estimation

no code implementations ECCV 2020 Lu Zhou, Yingying Chen, Yunze Gao, Jinqiao Wang, Hanqing Lu

To overcome the defects caused by the erasing operation, we perform feature reconstruction to recover the information destroyed by occlusion and the details lost in the cleaning procedure.

Pose Estimation

Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition

1 code implementation ECCV 2020 Ke Cheng, Yifan Zhang, Congqi Cao, Lei Shi, Jian Cheng, Hanqing Lu

Nevertheless, how to efficiently model the spatial-temporal skeleton graph without introducing extra computation burden is a challenging problem for industrial deployment.

Action Recognition Skeleton Based Action Recognition

Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment

1 code implementation ACL 2022 Zijie Huang, Zheng Li, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang

In this paper, we explore multilingual KG completion, which leverages limited seed alignment as a bridge, to embrace the collective knowledge from multiple languages.

Knowledge Graph Completion

QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction

no code implementations 19 Aug 2021 Danqing Zhang, Zheng Li, Tianyu Cao, Chen Luo, Tony Wu, Hanqing Lu, Yiwei Song, Bing Yin, Tuo Zhao, Qiang Yang

We study the problem of query attribute value extraction, which aims to identify named entities from user queries as diverse surface form attribute values and afterward transform them into formally canonical forms.

Attribute Value Extraction named-entity-recognition +2

OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation

2 code implementations 1 Jul 2021 Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang

In this paper, we propose an Omni-perception Pre-Trainer (OPT) for cross-modal understanding and generation, by jointly modeling visual, text and audio resources.

Audio to Text Retrieval Cross-Modal Retrieval +3

Improving Multiple Object Tracking With Single Object Tracking

no code implementations CVPR 2021 Linyu Zheng, Ming Tang, Yingying Chen, Guibo Zhu, Jinqiao Wang, Hanqing Lu

Despite considerable similarities between multiple object tracking (MOT) and single object tracking (SOT) tasks, modern MOT methods have not benefited from the development of SOT ones to achieve satisfactory performance.

Multiple Object Tracking object-detection +1

Graph-based Multilingual Product Retrieval in E-commerce Search

no code implementations NAACL 2021 Hanqing Lu, Youna Hu, Tong Zhao, Tony Wu, Yiwei Song, Bing Yin

Nowadays, with many e-commerce platforms conducting global business, e-commerce search systems are required to handle product retrieval under multilingual scenarios.

Graph Attention Retrieval

AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition

no code implementations ICCV 2021 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Existing methods for skeleton-based action recognition mainly focus on improving the recognition accuracy, whereas the efficiency of the model is rarely considered.

Action Recognition Skeleton Based Action Recognition

Fast Sequence Generation with Multi-Agent Reinforcement Learning

no code implementations 24 Jan 2021 Longteng Guo, Jing Liu, Xinxin Zhu, Hanqing Lu

These models are autoregressive in that they generate each word by conditioning on previously generated words, which leads to heavy latency during inference.

Image Captioning Machine Translation +4

HAIR: Hierarchical Visual-Semantic Relational Reasoning for Video Question Answering

no code implementations ICCV 2021 Fei Liu, Jing Liu, Weining Wang, Hanqing Lu

Specifically, we present a novel graph memory mechanism to perform relational reasoning, and further develop two types of graph memory: a) visual graph memory that leverages visual information of video for relational reasoning; b) semantic graph memory that is specifically designed to explicitly leverage semantic knowledge contained in the classes and attributes of video objects, and perform relational reasoning in the semantic space.

Question Answering Relational Reasoning +1

High-Performance Discriminative Tracking With Transformers

no code implementations ICCV 2021 Bin Yu, Ming Tang, Linyu Zheng, Guibo Zhu, Jinqiao Wang, Hao Feng, Xuetao Feng, Hanqing Lu

End-to-end discriminative trackers improve the state of the art significantly, yet the improvement in robustness and efficiency is restricted by the conventional discriminative model, i.e., least-squares-based regression.

Visual Tracking Vocal Bursts Intensity Prediction

Scene Segmentation with Dual Relation-aware Attention Network

1 code implementation TNNLS 2020 Jun Fu, Jing Liu, Jie Jiang, Yong Li, Yongjun Bao, Hanqing Lu

We conduct extensive experiments to validate the effectiveness of our network and achieve new state-of-the-art segmentation performance on four challenging scene segmentation data sets: Cityscapes, ADE20K, PASCAL Context, and COCO Stuff.

Scene Segmentation

Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition

1 code implementation 7 Jul 2020 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Besides, from the data aspect, we introduce a skeletal data decoupling technique to emphasize the specific characteristics of space/time and different motion scales, resulting in a more comprehensive understanding of the human actions. To test the effectiveness of the proposed method, extensive experiments are conducted on four challenging datasets for skeleton-based gesture and action recognition, namely, SHREC, DHG, NTU-60 and NTU-120, where DSTA-Net achieves state-of-the-art performance on all of them.

Action Recognition Skeleton Based Action Recognition +1

Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning

no code implementations 10 May 2020 Longteng Guo, Jing Liu, Xinxin Zhu, Xingjian He, Jie Jiang, Hanqing Lu

In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL).

Image Captioning Machine Translation +2

What and Where: Modeling Skeletons from Semantic and Spatial Perspectives for Action Recognition

no code implementations 7 Apr 2020 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

The two perspectives are orthogonal and complementary to each other; and by fusing them in a unified framework, our method achieves a more comprehensive understanding of the skeleton data.

Action Recognition Gesture Recognition +2

Skeleton-Based Action Recognition with Multi-Stream Adaptive Graph Convolutional Networks

3 code implementations 15 Dec 2019 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Second, the second-order information of the skeleton data, i.e., the length and orientation of the bones, which is naturally more informative and discriminative for human action recognition, is rarely investigated.

Action Recognition graph construction +2

Action Recognition via Pose-Based Graph Convolutional Networks with Intermediate Dense Supervision

no code implementations 28 Nov 2019 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Existing methods exploit the joint positions to extract the body-part features from the activation map of the convolutional networks to assist human action recognition.

Action Recognition Skeleton Based Action Recognition +1

Adaptive Context Network for Scene Parsing

no code implementations ICCV 2019 Jun Fu, Jing Liu, Yuhang Wang, Yong Li, Yongjun Bao, Jinhui Tang, Hanqing Lu

Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally.

Scene Parsing Semantic Segmentation

Aligning Linguistic Words and Visual Semantic Units for Image Captioning

1 code implementation 6 Aug 2019 Longteng Guo, Jing Liu, Jinhui Tang, Jiangwei Li, Wei Luo, Hanqing Lu

Image captioning attempts to generate a sentence composed of several linguistic words, which are used to describe objects, attributes, and interactions in an image, denoted as visual semantic units in this paper.

Image Captioning

Non-Local Graph Convolutional Networks for Skeleton-Based Action Recognition

1 code implementation arXiv 2019 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

However, the topology of the graph is set by hand and fixed over all layers, which may not be optimal for the action recognition task and the hierarchical CNN structures.

Action Recognition Skeleton Based Action Recognition

Learning Feature Embeddings for Discriminant Model based Tracking

no code implementations ECCV 2020 Linyu Zheng, Ming Tang, Yingying Chen, Jinqiao Wang, Hanqing Lu

After observing that the features used in most online discriminatively trained trackers are not optimal, in this paper, we propose a novel and effective architecture to learn optimal feature embeddings for online discriminative tracking.

Visual Tracking

Dual Attention Network for Scene Segmentation

12 code implementations CVPR 2019 Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, Hanqing Lu

Specifically, we append two types of attention modules on top of traditional dilated FCN, which model the semantic interdependencies in spatial and channel dimensions respectively.

Thermal Image Segmentation
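The abstract excerpt above describes two attention modules, one over spatial positions and one over channels. A minimal sketch of that idea in plain Python follows. It is a simplification under stated assumptions, not the authors' implementation: it omits any learned projections or scaling, applies the same dot-product self-attention to a flattened feature map and to its transpose, and fuses the two outputs by a plain element-wise sum. The helper names and the toy `features` values are hypothetical.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x):
    # Each row of x attends to every row of x (no learned projections here).
    weights = [softmax(row) for row in matmul(x, transpose(x))]
    attended = matmul(weights, x)
    # Residual connection: add the input back to the attended features.
    out = [[a + b for a, b in zip(ra, rx)] for ra, rx in zip(attended, x)]
    return out, weights

# Toy feature map flattened to N = 3 positions by C = 2 channels (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Spatial attention: rows are positions, so the weights are N x N.
spatial_out, spatial_w = self_attention(features)

# Channel attention: rows are channels, so the weights are C x C.
channel_out_t, channel_w = self_attention(transpose(features))
channel_out = transpose(channel_out_t)

# Assumed fusion for illustration: element-wise sum of the two branches.
fused = [[s + c for s, c in zip(rs, rc)] for rs, rc in zip(spatial_out, channel_out)]
```

Reusing one function on the feature map and its transpose makes the symmetry explicit: the only difference between the two branches in this sketch is which axis plays the role of "tokens".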

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

4 code implementations CVPR 2019 Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods.

graph construction Skeleton Based Action Recognition +1
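The second-order information mentioned in the excerpt above (bone lengths and directions) can be derived directly from joint coordinates. The sketch below is illustrative only: the joint names, coordinates, and the parent-to-child bone list are hypothetical, and treating each bone as the vector from a parent joint to a child joint is an assumption about the convention rather than something the excerpt states.

```python
import math

# Hypothetical skeleton: joint name -> (x, y, z) coordinate.
joints = {
    "hip": (0.0, 0.0, 0.0),
    "spine": (0.0, 1.0, 0.0),
    "neck": (0.0, 2.0, 0.0),
    "left_hand": (3.0, 2.0, 4.0),
}

# Hypothetical (parent, child) pairs defining the bones.
bones = [("hip", "spine"), ("spine", "neck"), ("neck", "left_hand")]

def bone_features(joints, bones):
    """Return {(parent, child): (vector, length, unit_direction)}."""
    feats = {}
    for parent, child in bones:
        # Bone vector assumed to point from the parent joint to the child joint.
        vec = tuple(c - p for p, c in zip(joints[parent], joints[child]))
        length = math.sqrt(sum(v * v for v in vec))
        # Guard against zero-length bones (coincident joints).
        if length > 0:
            direction = tuple(v / length for v in vec)
        else:
            direction = (0.0, 0.0, 0.0)
        feats[(parent, child)] = (vec, length, direction)
    return feats

features = bone_features(joints, bones)
```

In a two-stream setup, the joint coordinates and these bone vectors would feed separate streams; how the streams are combined is not specified in the excerpt.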

Recent Advances in Efficient Computation of Deep Convolutional Neural Networks

no code implementations 3 Feb 2018 Jian Cheng, Peisong Wang, Gang Li, Qinghao Hu, Hanqing Lu

As for hardware implementation of deep neural networks, a batch of accelerators based on FPGA/ASIC have been proposed in recent years.

Network Pruning Quantization

Decoding with Value Networks for Neural Machine Translation

no code implementations NeurIPS 2017 Di He, Hanqing Lu, Yingce Xia, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Inspired by the success and methodology of AlphaGo, in this paper we propose using a prediction network to improve beam search, which takes the source sentence $x$, the currently available decoding output $y_1,\cdots, y_{t-1}$ and a candidate word $w$ at step $t$ as inputs and predicts the long-term value (e.g., BLEU score) of the partial target sentence if it is completed by the NMT model.

Machine Translation NMT +1
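One way to picture how a predicted long-term value could steer beam search, as the excerpt above describes, is to rerank candidate partial translations by a blend of the NMT model's log-probability and the value prediction. The sketch below is a hedged illustration: the blending rule, the weight `alpha`, and the candidate data are assumptions made for this example and are not taken from the excerpt. The `value` field stands in for the output of a value network, which is not implemented here.

```python
import math

def combined_score(log_prob, length, value, alpha=0.85):
    # Assumed blend: length-normalised log-probability plus log of the
    # predicted value (value must be > 0 for the logarithm to be defined).
    return alpha * (log_prob / length) + (1.0 - alpha) * math.log(value)

def rerank(candidates, beam_size, alpha=0.85):
    # Keep the beam_size best candidates under the combined score.
    ranked = sorted(
        candidates,
        key=lambda c: combined_score(c["log_prob"], len(c["tokens"]), c["value"], alpha),
        reverse=True,
    )
    return ranked[:beam_size]

# Made-up partial translations with made-up model scores and value predictions.
candidates = [
    {"tokens": ["the", "cat"], "log_prob": -1.0, "value": 0.2},
    {"tokens": ["a", "cat"], "log_prob": -1.2, "value": 0.6},
    {"tokens": ["the", "dog"], "log_prob": -3.0, "value": 0.3},
]

by_model_only = rerank(candidates, beam_size=2, alpha=1.0)
blended = rerank(candidates, beam_size=2, alpha=0.5)
```

With `alpha=1.0` the ranking reduces to ordinary length-normalised beam search; lowering `alpha` lets a candidate with a slightly lower model probability but a higher predicted value move to the front of the beam.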

CONE: Community Oriented Network Embedding

no code implementations 5 Sep 2017 Carl Yang, Hanqing Lu, Kevin Chen-Chuan Chang

It is usually modeled as an unsupervised clustering problem on graphs, based on heuristic assumptions about community characteristics, such as edge density and node homogeneity.

Social and Information Networks Physics and Society

Stacked Deconvolutional Network for Semantic Segmentation

no code implementations 16 Aug 2017 Jun Fu, Jing Liu, Yuhang Wang, Hanqing Lu

In SDN, multiple shallow deconvolutional networks, called SDN units, are stacked one by one to integrate contextual information and guarantee the fine recovery of localization information.

Semantic Segmentation

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

3 code implementations ICCV 2017 Yousong Zhu, Chaoyang Zhao, Jinqiao Wang, Xu Zhao, Yi Wu, Hanqing Lu

To fully explore the local and global properties, in this paper, we propose a novel fully convolutional network, named CoupleNet, to couple the global structure with local parts for object detection.

object-detection Object Detection +1

Body Joint guided 3D Deep Convolutional Descriptors for Action Recognition

no code implementations 24 Apr 2017 Congqi Cao, Yifan Zhang, Chunjie Zhang, Hanqing Lu

To make it end-to-end and independent of any sophisticated body joint detection algorithm, we further propose a two-stream bilinear model which can learn the guidance from the body joints and capture the spatio-temporal features simultaneously.

Action Recognition Temporal Action Localization

Relaxing From Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging

no code implementations ICCV 2015 Jianlong Fu, Yue Wu, Tao Mei, Jinqiao Wang, Hanqing Lu, Yong Rui

The development of deep learning has given machines a capability comparable to that of human beings in recognizing a limited set of image categories.

Online Sketching Hashing

no code implementations CVPR 2015 Cong Leng, Jiaxiang Wu, Jian Cheng, Xiao Bai, Hanqing Lu

Recently, hashing based approximate nearest neighbor (ANN) search has attracted much attention.

Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction

no code implementations CVPR 2014 Jian Cheng, Cong Leng, Jiaxiang Wu, Hainan Cui, Hanqing Lu

Image matching is one of the most challenging stages in 3D reconstruction: it usually accounts for half of the computational cost, and inaccurate matching may lead to reconstruction failure.

3D Reconstruction

Weakly-Supervised Dual Clustering for Image Semantic Segmentation

no code implementations CVPR 2013 Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions.

Clustering Image Segmentation +3

