PP-YOLOv2: A Practical Object Detector

1 code implementation 21 Apr 2021 Xin Huang, Xinxin Wang, Wenyu Lv, Xiaying Bai, Xiang Long, Kaipeng Deng, Qingqing Dang, Shumin Han, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma, Osamu Yoshie

To meet these two concerns, we comprehensively evaluate a collection of existing refinements to improve the performance of PP-YOLO while keeping the inference time almost unchanged.

Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English

1 code implementation 7 Jan 2021 Xiangyang Li, Yu Xia, Xiang Long, Zheng Li, Sujian Li

In this paper, we describe our system for the AAAI 2021 shared task of COVID-19 Fake News Detection in English, where we achieved 3rd place with a weighted F1 score of 0.9859 on the test set.

Fake News Detection

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

1 code implementation 27 Oct 2020 Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan

We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.

Representation Learning Self-Supervised Action Recognition +1

PP-YOLO: An Effective and Efficient Implementation of Object Detector

5 code implementations 23 Jul 2020 Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding, Shilei Wen

We mainly try to combine various existing tricks that barely increase the number of model parameters and FLOPs, with the goal of improving the detector's accuracy as much as possible while keeping the speed almost unchanged.

Object Detection

Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement

no code implementations ECCV 2020 Jian Wang, Xiang Long, Yuan Gao, Errui Ding, Shilei Wen

In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.

Pose Estimation

FenceMask: A Data Augmentation Approach for Pre-extracted Image Features

no code implementations 14 Jun 2020 Pu Li, Xiang-Yang Li, Xiang Long

It is based on the 'simulation of object occlusion' strategy, which aims to strike a balance between object occlusion and information retention in the input data.

Data Augmentation Fine-Grained Visual Categorization

Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification

no code implementations 17 Dec 2019 Renchun You, Zhiyao Guo, Lei Cui, Xiang Long, Yingze Bao, Shilei Wen

To overcome these challenges, we propose to use cross-modality attention with semantic graph embedding for multi-label classification.

Classification General Classification +4

Multi-Label Classification with Label Graph Superimposing

2 code implementations 21 Nov 2019 Ya Wang, Dongliang He, Fu Li, Xiang Long, Zhichao Zhou, Jinwen Ma, Shilei Wen

In this paper, we propose a label graph superimposing framework to improve the conventional GCN+CNN framework developed for multi-label recognition in the following two aspects.

Classification General Classification +2

Deep Concept-wise Temporal Convolutional Networks for Action Localization

2 code implementations 26 Aug 2019 Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, Wangmeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen

In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.

Action Classification Action Localization

Exploiting Spatial-Temporal Modelling and Multi-Modal Fusion for Human Action Recognition

no code implementations 27 Jun 2018 Dongliang He, Fu Li, Qijie Zhao, Xiang Long, Yi Fu, Shilei Wen

In this challenge, we propose the spatial-temporal network (StNet) for better joint spatial-temporal modelling and comprehensive video understanding.

Action Recognition Video Understanding

Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification

2 code implementations CVPR 2018 Xiang Long, Chuang Gan, Gerard de Melo, Jiajun Wu, Xiao Liu, Shilei Wen

In this paper, however, we show that temporal information, especially longer-term patterns, may not be necessary to achieve competitive results on common video classification datasets.

Classification General Classification +1

Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification

no code implementations 12 Aug 2017 Yunlong Bian, Chuang Gan, Xiao Liu, Fu Li, Xiang Long, Yandong Li, Heng Qi, Jie Zhou, Shilei Wen, Yuanqing Lin

Experimental results on the challenging Kinetics dataset demonstrate that our proposed temporal modeling approaches can significantly improve on existing approaches in large-scale video recognition tasks.

Action Classification Fine-tuning +3

Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding

1 code implementation 14 Jul 2017 Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie Zhou, Shilei Wen

This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge, which ranked 3rd.

Video Recognition Video Understanding

Semi-Federated Scheduling of Parallel Real-Time Tasks on Multiprocessors

no code implementations 9 May 2017 Xu Jiang, Nan Guan, Xiang Long, Wang Yi

In this paper, we propose the semi-federated scheduling approach, which grants only $x$ dedicated processors to a heavy task with processing capacity requirement $x + \epsilon$, and schedules the remaining $\epsilon$ part together with light tasks on shared processors.

Distributed, Parallel, and Cluster Computing
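
The split described in the semi-federated scheduling abstract above can be illustrated with a minimal Python sketch. It assumes each task's processing capacity requirement is already known as a non-negative number (the abstract does not say how it is computed), and the names `split_capacity` and `partition` are hypothetical helpers, not part of the paper. The sketch only shows the $x + \epsilon$ decomposition; it does not model how the shared processors are actually scheduled.

```python
import math
from typing import Dict, List, Tuple


def split_capacity(requirement: float) -> Tuple[int, float]:
    """Split a requirement x + eps into x dedicated processors and the remainder eps.

    Assumption: the requirement is given as input; its derivation is not modeled.
    """
    if requirement < 0:
        raise ValueError("capacity requirement must be non-negative")
    dedicated = math.floor(requirement)   # the integer part x
    remainder = requirement - dedicated   # the fractional part eps, in [0, 1)
    return dedicated, remainder


def partition(requirements: Dict[str, float]) -> Tuple[int, List[Tuple[str, float]]]:
    """Return the total number of dedicated processors and the loads left for shared ones."""
    dedicated_total = 0
    shared_loads: List[Tuple[str, float]] = []
    for name, requirement in requirements.items():
        dedicated, remainder = split_capacity(requirement)
        dedicated_total += dedicated
        if remainder > 0:
            # Fractional parts of heavy tasks join the light tasks on shared processors.
            shared_loads.append((name, remainder))
    return dedicated_total, shared_loads


if __name__ == "__main__":
    # Hypothetical task set: one heavy task (2.5) and one light task (0.75).
    print(partition({"heavy": 2.5, "light": 0.75}))
```

In this sketch a light task (requirement below 1) simply receives zero dedicated processors, so its whole load ends up in the shared pool alongside the fractional parts of heavy tasks.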

Video Captioning with Multi-Faceted Attention

no code implementations TACL 2018 Xiang Long, Chuang Gan, Gerard de Melo

Recently, video captioning has been attracting an increasing amount of interest, due to its potential for improving accessibility and information retrieval.

Information Retrieval Video Captioning

