Search Results for author: LiMin Wang

Found 30 papers, 17 papers with code

A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark

no code implementations • 24 Oct 2021 • Zhenxi Zhu, Limin Wang, Sheng Guo, Gangshan Wu

In this paper, we aim to present an in-depth study on few-shot video classification by making three contributions.

Classification · Meta-Learning +2

End-to-End Dense Video Grounding via Parallel Regression

no code implementations • 23 Sep 2021 • Fengyuan Shi, Limin Wang, Weilin Huang

In this paper, we tackle a new problem of dense video grounding, by simultaneously localizing multiple moments with a paragraph as input.

Mutual Supervision for Dense Object Detection

no code implementations • ICCV 2021 • Ziteng Gao, Limin Wang, Gangshan Wu

In this paper, we break the convention of using the same training samples for the two heads of dense detectors and explore a novel supervisory paradigm, termed Mutual Supervision (MuSu), which respectively and mutually assigns training samples to the classification and regression heads to ensure this consistency.

Classification · Dense Object Detection

Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

no code implementations • 10 Sep 2021 • Zhenzhi Wang, Limin Wang, Tao Wu, Tianhao Li, Gangshan Wu

Instead, viewing temporal grounding as a metric-learning problem, we present a Dual Matching Network (DMN) that directly models the relations between language queries and video moments in a joint embedding space.

Metric Learning · Representation Learning
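
The joint-embedding view described in the DMN snippet above can be sketched as follows; this is a generic cosine-similarity matching sketch, not the paper's actual model, and the function name is ours:

```python
import numpy as np

def matching_scores(query_embs, moment_embs):
    """Cosine similarities between language-query and video-moment embeddings
    in a shared space: matched (diagonal) pairs should score higher than the
    off-diagonal negative pairs that a metric-learning loss pushes apart."""
    q = np.asarray(query_embs, dtype=np.float64)
    m = np.asarray(moment_embs, dtype=np.float64)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)  # unit-normalize queries
    m = m / np.linalg.norm(m, axis=1, keepdims=True)  # unit-normalize moments
    return q @ m.T  # (num_queries, num_moments) similarity matrix
```

With embeddings like this, each row's diagonal entry is the positive pair and every other entry in the row is a negative sample, which is exactly what a metric-learning loss contrasts.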

Self Supervision to Distillation for Long-Tailed Visual Recognition

no code implementations • ICCV 2021 • Tianhao Li, Limin Wang, Gangshan Wu

In this paper, we show that soft labels can serve as a powerful solution for incorporating label correlation into a multi-stage training scheme for long-tailed recognition.
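
A standard way to produce such soft labels is a temperature-scaled softmax over a first-stage model's logits; the sketch below illustrates that generic recipe, not the paper's exact scheme:

```python
import numpy as np

def soft_labels(teacher_logits, temperature=2.0):
    """Temperature-scaled softmax: turns a teacher's logits into soft labels
    that encode correlation between classes for a later distillation stage."""
    z = np.asarray(teacher_logits, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

Raising the temperature flattens the distribution, so secondary classes retain non-negligible probability mass instead of being crushed to zero by a hard one-hot label.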

Target Adaptive Context Aggregation for Video Scene Graph Generation

1 code implementation • ICCV 2021 • Yao Teng, Limin Wang, Zhifeng Li, Gangshan Wu

Specifically, we design an efficient method for frame-level VidSGG, termed Target Adaptive Context Aggregation Network (TRACE), with a focus on capturing spatio-temporal context information for relation recognition.

Graph Generation · Scene Graph Generation

Structured Sparse R-CNN for Direct Scene Graph Generation

no code implementations • 21 Jun 2021 • Yao Teng, Limin Wang

The key to our method is a set of learnable triplet queries and structured triplet detectors which could be jointly optimized from the training set in an end-to-end manner.

Graph Construction · Graph Generation +3

CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation

1 code implementation • CVPR 2021 • Tao Lu, Limin Wang, Gangshan Wu

Previous point cloud semantic segmentation networks use the same process to aggregate features from neighbors of the same category and different categories.

Semantic Segmentation

Joint Landmark and Structure Learning for Automatic Evaluation of Developmental Dysplasia of the Hip

no code implementations • 10 Jun 2021 • Xindi Hu, Limin Wang, Xin Yang, Xu Zhou, Wufeng Xue, Yan Cao, Shengfeng Liu, Yuhao Huang, Shuangping Guo, Ning Shang, Dong Ni, Ning Gu

In this study, we propose a multi-task framework to learn the relationships among landmarks and structures jointly and automatically evaluate DDH.

SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

1 code implementation • 6 Jun 2021 • Zeyu Ruan, Changqing Zou, Longhai Wu, Gangshan Wu, Limin Wang

Three-dimensional dense face alignment and reconstruction in the wild is a challenging problem, as partial facial information is commonly missing in occluded and large-pose face images.

3D Face Alignment · 3D Face Reconstruction +2

FineAction: A Fine-Grained Video Dataset for Temporal Action Localization

no code implementations • 24 May 2021 • Yi Liu, Limin Wang, Xiao Ma, Yali Wang, Yu Qiao

Second, coarse action classes often lead to ambiguous annotations of temporal boundaries, which are inappropriate for temporal action localization.

Temporal Action Localization · Temporal Localization +1

MGSampler: An Explainable Sampling Strategy for Video Action Recognition

2 code implementations • ICCV 2021 • Yuan Zhi, Zhan Tong, Limin Wang, Gangshan Wu

First, we present two different motion representations that enable us to efficiently distinguish motion-salient frames from the background.

Action Recognition
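
The motion-guided sampling idea behind MGSampler can be sketched as follows; the motion score (per-frame pixel differences) and the quantile-based selection are our simplification of the approach, and the function name is ours:

```python
import numpy as np

def motion_guided_indices(frames, num_out=8):
    """Score each frame transition by the magnitude of its pixel difference,
    then pick samples at uniform quantiles of the cumulative motion
    distribution, so motion-salient parts of the video contribute more frames."""
    frames = np.asarray(frames, dtype=np.float32)
    diffs = np.abs(np.diff(frames, axis=0))                       # (T-1, ...)
    motion = diffs.reshape(len(frames) - 1, -1).sum(axis=1) + 1e-8  # avoid all-zero case
    cdf = np.cumsum(motion) / motion.sum()                        # cumulative motion
    targets = (np.arange(num_out) + 0.5) / num_out                # uniform quantiles
    return [int(np.searchsorted(cdf, t)) for t in targets]
```

Because the quantiles are taken over cumulative motion rather than raw time, static stretches of video are skipped quickly while high-motion stretches are sampled densely.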

Target Transformed Regression for Accurate Tracking

1 code implementation • 1 Apr 2021 • Yutao Cui, Cheng Jiang, Limin Wang, Gangshan Wu

Accurate tracking is still a challenging task due to appearance variations, pose and view changes, and geometric deformations of the target in videos.

Visual Object Tracking · Visual Tracking

Relaxed Transformer Decoders for Direct Action Proposal Generation

2 code implementations • ICCV 2021 • Jing Tan, Jiaqi Tang, Limin Wang, Gangshan Wu

Extensive experiments on the THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of RTD-Net on both tasks of temporal action proposal generation and temporal action detection.

Action Detection · Temporal Action Proposal Generation +1

Temporal Difference Networks for Action Recognition

no code implementations • 1 Jan 2021 • Limin Wang, Bin Ji, Zhan Tong, Gangshan Wu

To mitigate this issue, this paper presents a new video architecture, termed Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal information for efficient action recognition.

Action Recognition · Action Recognition In Videos +1

TDN: Temporal Difference Networks for Efficient Action Recognition

1 code implementation • CVPR 2021 • Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu

To mitigate this issue, this paper presents a new video architecture, termed Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal information for efficient action recognition.

Ranked #1 on Action Recognition on Something-Something V2 (using extra training data)

Action Classification · Action Recognition +2
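
The temporal-difference operator at the heart of TDN can be sketched as follows; shapes and usage here are a simplification of the paper's short-term and long-term modules, and the function name is ours:

```python
import numpy as np

def temporal_difference(frames):
    """Subtract neighbouring frames to obtain a lightweight, RGB-derived
    motion cue, avoiding the cost of computing optical flow."""
    frames = np.asarray(frames, dtype=np.float32)  # (T, H, W) or (T, H, W, C)
    return frames[1:] - frames[:-1]                # (T-1, ...) difference maps
```

In the full architecture such difference maps are fed to convolutional branches at multiple temporal scales, but the cue itself is just this elementwise subtraction.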

Appearance-and-Relation Networks for Video Classification

1 code implementation • CVPR 2018 • Limin Wang, Wei Li, Wen Li, Luc van Gool

Specifically, SMART blocks decouple the spatiotemporal learning module into an appearance branch for spatial modeling and a relation branch for temporal modeling.

Action Classification · Action Recognition +2

Temporal Segment Networks for Action Recognition in Videos

8 code implementations • 8 May 2017 • Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

Furthermore, based on temporal segment networks, we won the video classification track of the ActivityNet Challenge 2016 among 24 teams, which demonstrates the effectiveness of TSN and the proposed good practices.

Ranked #14 on Action Classification on Moments in Time (Top 5 Accuracy metric)

Action Classification · Action Recognition +2

UntrimmedNets for Weakly Supervised Action Recognition and Detection

2 code implementations • CVPR 2017 • Limin Wang, Yuanjun Xiong, Dahua Lin, Luc van Gool

We exploit the learned models for action recognition (WSR) and detection (WSD) on the untrimmed video datasets of THUMOS14 and ActivityNet.

Weakly Supervised Action Localization · Weakly-Supervised Action Recognition

Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs

2 code implementations • 4 Oct 2016 • Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao

Convolutional Neural Networks (CNNs) have made remarkable progress on scene recognition, partially due to recent large-scale scene datasets such as Places and Places2.

General Classification · Scene Classification +1

Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images

no code implementations • 1 Sep 2016 • Limin Wang, Zhe Wang, Yu Qiao, Luc van Gool

These newly designed transferring techniques exploit multi-task learning frameworks to incorporate extra knowledge from other networks and additional datasets into the training procedure of event CNNs.

Multi-Task Learning

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

18 code implementations • 2 Aug 2016 • Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

The other contribution is our study of a series of good practices for learning ConvNets on video data with the help of the temporal segment network framework.

Action Classification · Action Recognition +3
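
TSN's two core ingredients, sparse segment sampling and segmental consensus, can be sketched as follows; this is a minimal illustration of the scheme (with average consensus), and the function names are ours:

```python
import random

def sample_segment_indices(num_frames, num_segments=3):
    """Split the video into equal-duration segments and draw one random
    frame index from each: TSN's sparse temporal sampling."""
    seg_len = num_frames / num_segments
    return [int(i * seg_len + random.random() * seg_len)
            for i in range(num_segments)]

def segmental_consensus(per_segment_scores):
    """Fuse per-snippet class scores into a video-level prediction by
    averaging (the 'average' consensus function)."""
    n = len(per_segment_scores)
    num_classes = len(per_segment_scores[0])
    return [sum(s[c] for s in per_segment_scores) / n
            for c in range(num_classes)]
```

Sampling one snippet per segment keeps the cost of processing a long video constant while still covering its whole duration, which is what makes the video-level supervision practical.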

Actionness Estimation Using Hybrid Fully Convolutional Networks

no code implementations • CVPR 2016 • Limin Wang, Yu Qiao, Xiaoou Tang, Luc van Gool

Actionness was introduced to quantify the likelihood that a generic action instance is present at a specific location.

Action Detection · Action Recognition

Better Exploiting OS-CNNs for Better Event Recognition in Images

no code implementations • 14 Oct 2015 • Limin Wang, Zhe Wang, Sheng Guo, Yu Qiao

Event recognition from still images is one of the most important problems for image understanding.

Object Recognition · Scene Recognition

Places205-VGGNet Models for Scene Recognition

1 code implementation • 7 Aug 2015 • Limin Wang, Sheng Guo, Weilin Huang, Yu Qiao

We verify the performance of trained Places205-VGGNet models on three datasets: MIT67, SUN397, and Places205.

Object Recognition · Scene Recognition

Towards Good Practices for Very Deep Two-Stream ConvNets

5 code implementations • 8 Jul 2015 • Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao

However, for action recognition in videos, the improvement of deep convolutional networks is not so evident.

Action Recognition · Action Recognition In Videos +2

Object-Scene Convolutional Neural Networks for Event Recognition in Images

no code implementations • 2 May 2015 • Limin Wang, Zhe Wang, Wenbin Du, Yu Qiao

Meanwhile, we investigate different network architectures for OS-CNN design, and adapt the deep (AlexNet) and very-deep (GoogLeNet) networks to the task of event recognition.
