Search Results for author: Jian-Fang Hu

Found 14 papers, 1 papers with code

Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

1 code implementation • 17 Mar 2024 • Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng

Universal image restoration is a practical and potential computer vision task for real-world applications.

Image Restoration Zero-shot Generalization

Paper
Code

Latent Embeddings for Collective Activity Recognition

no code implementations • 20 Sep 2017 • Yongyi Tang, Peizhen Zhang, Jian-Fang Hu, Wei-Shi Zheng

Rather than simply recognizing the action of a person individually, collective activity recognition aims to find out what a group of people is acting in a collective scene.

Activity Recognition

Paper
Add Code

Improving Fast Segmentation With Teacher-student Learning

no code implementations • 19 Oct 2018 • Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks.

Segmentation

Paper
Add Code

Deep Bilinear Learning for RGB-D Action Recognition

no code implementations • ECCV 2018 • Jian-Fang Hu, Wei-Shi Zheng, Jia-Hui Pan, Jian-Huang Lai, Jian-Guo Zhang

In this paper, we focus on exploring modality-temporal mutual information for RGB-D action recognition.

Action Recognition Temporal Action Localization

Paper
Add Code

Jointly learning heterogeneous features for rgb-d activity recognition

no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 39 , Issue: 11 , Nov. 1 2017 ) 2016 • Jian-Fang Hu, Wei-Shi Zheng, Jian-Huang Lai, Jian-Guo Zhang

The proposed model formed in a unified framework is capable of: 1) jointly mining a set of subspaces with the same dimensionality to exploit latent shared features across different feature channels, 2) meanwhile, quantifying the shared and feature-specific components of features in the subspaces, and 3) transferring feature-specific intermediate transforms (i-transforms) for learning fusion of heterogeneous features across datasets.

Ranked #8 on Skeleton Based Action Recognition on SYSU 3D

Activity Recognition Benchmarking +3

Paper
Add Code

Early action prediction by soft regression

no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 • Jian-Fang Hu, Wei-Shi Zheng, Lianyang Ma, Gang Wang, Jian-Huang Lai, Jian-Guo Zhang

Our formulation of soft regression framework 1) overcomes a usual assumption in existing early action prediction systems that the progress level of on-going sequence is given in the testing stage; and 2) presents a theoretical framework to better resolve the ambiguity and uncertainty of subsequences at early performing stage.

Ranked #70 on Skeleton Based Action Recognition on NTU RGB+D 120

Early Action Prediction regression +1

Paper
Add Code

Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding

no code implementations • 20 Jun 2021 • Chaolei Tan, Zihang Lin, Jian-Fang Hu, Xiang Li, Wei-Shi Zheng

We propose an effective two-stage approach to tackle the problem of language-based Human-centric Spatio-Temporal Video Grounding (HC-STVG) task.

Spatio-Temporal Video Grounding Video Grounding

Paper
Add Code

Predictive Feature Learning for Future Segmentation Prediction

no code implementations • ICCV 2021 • Zihang Lin, Jiangxin Sun, Jian-Fang Hu, QiZhi Yu, Jian-Huang Lai, Wei-Shi Zheng

In the latent feature learned by the autoencoder, global structures are enhanced and local details are suppressed so that it is more predictive.

Segmentation

Paper
Add Code

Action-guided 3D Human Motion Prediction

no code implementations • NeurIPS 2021 • Jiangxin Sun, Zihang Lin, Xintong Han, Jian-Fang Hu, Jia Xu, Wei-Shi Zheng

The ability of forecasting future human motion is important for human-machine interaction systems to understand human behaviors and make interaction.

Human motion prediction motion prediction

Paper
Add Code

STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding

no code implementations • 6 Jul 2022 • Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static branch performs cross-modal understanding in a single frame and learns to localize the target object spatially according to intra-frame visual cues like object appearances.

Ranked #2 on Spatio-Temporal Video Grounding on HC-STVG2

Spatio-Temporal Video Grounding Video Grounding

Paper
Add Code

Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding

no code implementations • CVPR 2023 • Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static stream performs cross-modal understanding in a single frame and learns to attend to the target object spatially according to intra-frame visual cues like object appearances.

Object Spatio-Temporal Video Grounding +1

Paper
Add Code

Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding

no code implementations • CVPR 2023 • Chaolei Tan, Zihang Lin, Jian-Fang Hu, Wei-Shi Zheng, JianHuang Lai

Specifically, we develop a hierarchical encoder that encodes the multi-modal inputs into semantics-aligned representations at different levels.

Sentence Video Grounding

Paper
Add Code

Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding

no code implementations • 18 Mar 2024 • Chaolei Tan, JianHuang Lai, Wei-Shi Zheng, Jian-Fang Hu

Different from previous weakly-supervised grounding frameworks based on multiple instance learning or reconstruction learning for two-stage candidate ranking, we propose a novel siamese learning framework that jointly learns the cross-modal feature alignment and temporal coordinate regression without timestamp labels to achieve concise one-stage localization for WSVPG.

Multiple Instance Learning

Paper
Add Code

Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels

no code implementations • 21 Mar 2024 • Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu

This paper focuses on open-ended video question answering, which aims to find the correct answers from a large answer set in response to a video-related question.

Multi-Label Classification Question Answering +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.