Search Results for author: Huadong Ma

Found 42 papers, 19 papers with code

XR-VLM: Cross-Relationship Modeling with Multi-part Prompts and Visual Features for Fine-Grained Recognition

no code implementations 10 Mar 2025 Chuanming Wang, Henming Mao, Huanhuan Zhang, Huiyuan Fu, Huadong Ma

To further enhance discriminative capability, we propose a cross relationship modeling pattern that combines visual features with all class prompt features, enabling a deeper exploration of the relationships between these two modalities.

Fine-Grained Visual Recognition

Robust Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning

2 code implementations 18 Feb 2025 Mengshi Qi, Changsheng Lv, Huadong Ma

Furthermore, we introduce a counterfactual learning module to augment the model's reasoning ability by modeling physical knowledge relationships among different objects under counterfactual intervention.

counterfactual

Target-driven Self-Distillation for Partial Observed Trajectories Forecasting

no code implementations 28 Jan 2025 Pengfei Zhu, Peng Shu, Mengshi Qi, Liang Liu, Huadong Ma

This involves firstly training a fully observed model and then using a distillation process to create the final model.

Knowledge Distillation Motion Forecasting

Towards Robust Unsupervised Attention Prediction in Autonomous Driving

no code implementations 25 Jan 2025 Mengshi Qi, Xiaoyang Bi, Pengfei Zhu, Huadong Ma

Robustly predicting attention regions of interest for self-driving systems is crucial for driving safety but presents significant challenges due to the labor-intensive nature of obtaining large-scale attention labels and the domain gap between self-driving scenarios and natural scenes.

Autonomous Driving Data Augmentation +1

A New Teacher-Reviewer-Student Framework for Semi-supervised 2D Human Pose Estimation

no code implementations 16 Jan 2025 Wulian Yun, Mengshi Qi, Fei Peng, Huadong Ma

Secondly, we introduce a Multi-level Feature Learning strategy, which utilizes the outputs from different stages of the backbone to estimate the heatmap to guide network training, enriching the supervisory information while effectively capturing keypoint relationships.

2D Human Pose Estimation Data Augmentation +1

Towards Balanced Continual Multi-Modal Learning in Human Pose Estimation

no code implementations 9 Jan 2025 Jiaxuan Peng, Mengshi Qi, Dong Zhao, Huadong Ma

In this work, we introduce a novel balanced continual multi-modal learning method for 3D HPE, which harnesses the power of RGB, LiDAR, mmWave, and WiFi.

3D Human Pose Estimation 3D Pose Estimation +2

Action Quality Assessment via Hierarchical Pose-guided Multi-stage Contrastive Regression

1 code implementation 7 Jan 2025 Mengshi Qi, Hao Ye, Jiaxuan Peng, Huadong Ma

Firstly, we introduce a multi-scale dynamic visual-skeleton encoder to capture fine-grained spatio-temporal visual and skeletal features.

Action Quality Assessment Contrastive Learning +1

Learning Group Interactions and Semantic Intentions for Multi-Object Trajectory Prediction

1 code implementation 20 Dec 2024 Mengshi Qi, Yuxin Yang, Huadong Ma

Effective modeling of group interactions and dynamic semantic intentions is crucial for forecasting behaviors like trajectories or movements.

Prediction Trajectory Prediction

Improving Batch Normalization with TTA for Robust Object Detection in Self-Driving

no code implementations 28 Nov 2024 Dacheng Liao, Mengshi Qi, Liang Liu, Huadong Ma

In current open real-world autonomous driving scenarios, challenges such as sensor failure and extreme weather conditions hinder the generalization of most autonomous driving perception models to these unseen domains due to the domain shifts between the test and training data.

Autonomous Driving object-detection +2

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

no code implementations 28 Nov 2024 Changsheng Lv, Mengshi Qi, Liang Liu, Huadong Ma

Understanding the traffic scenes and then generating high-definition (HD) maps present significant challenges in autonomous driving.

Autonomous Driving counterfactual

PITN: Physics-Informed Temporal Networks for Cuffless Blood Pressure Estimation

1 code implementation 16 Aug 2024 Rui Wang, Mengshi Qi, Yingxia Shao, Anfu Zhou, Huadong Ma

To tackle this challenge, we introduce a novel physics-informed temporal network (PITN) with adversarial contrastive learning to enable precise BP estimation with very limited data.

Blood pressure estimation Contrastive Learning +1

Semi-Supervised Teacher-Reference-Student Architecture for Action Quality Assessment

no code implementations 29 Jul 2024 Wulian Yun, Mengshi Qi, Fei Peng, Huadong Ma

Differing from the traditional teacher-student network, we propose a teacher-reference-student architecture to learn from both unlabeled and labeled data, where the teacher network and the reference network are used to generate pseudo-labels for unlabeled data to supervise the student network.

Action Quality Assessment

Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation

1 code implementation 19 Jul 2024 Zhe Zhao, Mengshi Qi, Huadong Ma

Generating realistic human grasps is a crucial yet challenging task for applications involving object manipulation in computer graphics and robotics.

Grasp Generation

Frequency-based Matcher for Long-tailed Semantic Segmentation

1 code implementation 6 Jun 2024 Shan Li, Lu Yang, Pu Cao, Liulei Li, Huadong Ma

The successful application of semantic segmentation technology in the real world has been among the most exciting achievements in the computer vision community over the past decade.

Autonomous Driving object-detection +3

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

no code implementations 25 Apr 2024 Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, HaoNing Wu, Yixuan Gao, Yuqin Cao, ZiCheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, Jianquan Yang, Weigang Wang, Xi Fang, Xiaoxin Lv, Jun Yan, Tianwu Zhi, Yabin Zhang, Yaohui Li, Yang Li, Jingwen Xu, Jianzhao Liu, Yiting Liao, Junlin Li, Zihao Yu, Yiting Lu, Xin Li, Hossein Motamednia, S. Farhad Hosseini-Benvidi, Fengbin Guan, Ahmad Mahmoudi-Aznaveh, Azadeh Mansouri, Ganzorig Gankhuyag, Kihwan Yoon, Yifang Xu, Haotian Fan, Fangyuan Kong, Shiling Zhao, Weifeng Dong, Haibing Yin, Li Zhu, Zhiling Wang, Bingchen Huang, Avinab Saha, Sandeep Mishra, Shashank Gupta, Rajesh Sureddi, Oindrila Saha, Luigi Celona, Simone Bianco, Paolo Napoletano, Raimondo Schettini, Junfeng Yang, Jing Fu, Wei zhang, Wenzhi Cao, Limei Liu, Han Peng, Weijun Yuan, Zhan Li, Yihang Cheng, Yifan Deng, Haohui Li, Bowen Qu, Yao Li, Shuqing Luo, Shunzhou Wang, Wei Gao, Zihao Lu, Marcos V. Conde, Xinrui Wang, Zhibo Chen, Ruling Liao, Yan Ye, Qiulin Wang, Bing Li, Zhaokun Zhou, Miao Geng, Rui Chen, Xin Tao, Xiaoyu Liang, Shangkun Sun, Xingyuan Ma, Jiaze Li, Mengduo Yang, Haoran Xu, Jie zhou, Shiding Zhu, Bohan Yu, Pengfei Chen, Xinrui Xu, Jiabin Shen, Zhichao Duan, Erfan Asadi, Jiahe Liu, Qi Yan, Youran Qu, Xiaohui Zeng, Lele Wang, Renjie Liao

A total of 196 participants have registered in the video track.

Image Quality Assessment Image Restoration +2

G3R: Generating Rich and Fine-grained mmWave Radar Data from 2D Videos for Generalized Gesture Recognition

no code implementations 23 Apr 2024 Kaikai Deng, Dong Zhao, Wenxin Zheng, Yue Ling, Kangwen Yin, Huadong Ma

Millimeter wave radar is gaining traction recently as a promising modality for enabling pervasive and privacy-preserving gesture recognition.

Decoder Gesture Recognition +1

SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model

2 code implementations 13 Mar 2024 Yihao Liu, Feng Xue, Anlong Ming, Mingshuai Zhao, Huadong Ma, Nicu Sebe

Firstly, to obtain consistent depth across diverse scenes, we propose a novel metric scale modeling, i.e., variation-based unnormalized depth bins.

Depth Estimation

Region-Aware Exposure Consistency Network for Mixed Exposure Correction

1 code implementation 28 Feb 2024 Jin Liu, Huiyuan Fu, Chuanming Wang, Huadong Ma

Exposure correction aims to enhance images suffering from improper exposure to achieve satisfactory visual effects.

Exposure Correction

Learning Exposure Correction in Dynamic Scenes

1 code implementation 27 Feb 2024 Jin Liu, Bo wang, Chuanming Wang, Huiyuan Fu, Huadong Ma

Exposure correction aims to enhance visual data suffering from improper exposures, which can greatly improve visual effects.

Exposure Correction Video Enhancement

Mutual Distillation Learning For Person Re-Identification

1 code implementation 12 Jan 2024 Huiyuan Fu, Kuilong Cui, Chuanming Wang, Mengshi Qi, Huadong Ma

With the rapid advancements in deep learning technologies, person re-identification (ReID) has witnessed remarkable performance improvements.

Hard Attention Person Re-Identification

Uncovering the human motion pattern: Pattern Memory-based Diffusion Model for Trajectory Prediction

no code implementations 5 Jan 2024 Yuxin Yang, Pengfei Zhu, Mengshi Qi, Huadong Ma

To uncover latent motion patterns in human behavior, we introduce a novel memory-based method, named Motion Pattern Priors Memory Network.

Autonomous Driving Retrieval +1

Multi-Stage Contrastive Regression for Action Quality Assessment

1 code implementation 5 Jan 2024 Qi An, Mengshi Qi, Huadong Ma

In recent years, there has been growing interest in video-based action quality assessment (AQA).

Action Quality Assessment Contrastive Learning +1

Towards Efficient Object Re-Identification with A Novel Cloud-Edge Collaborative Framework

no code implementations 4 Jan 2024 Chuanming Wang, Yuxin Yang, Mengshi Qi, Huadong Ma

Object re-identification (ReID) is committed to searching for objects of the same identity across cameras, and its real-world deployment is gradually increasing.

Collaborative Inference Object

Continuous Optical Zooming: A Benchmark for Arbitrary-Scale Image Super-Resolution in Real World

1 code implementation CVPR 2024 Huiyuan Fu, Fei Peng, Xianwei Li, Yejun Li, Xin Wang, Huadong Ma

The extensive experiments demonstrate the superior performance of the arbitrary-scale SR models trained on the COZ dataset compared to models trained on simulated data.

Image Super-Resolution Meta-Learning

VIoTGPT: Learning to Schedule Vision Tools in LLMs towards Intelligent Video Internet of Things

1 code implementation 1 Dec 2023 Yaoyao Zhong, Mengshi Qi, Rui Wang, Yuhan Qiu, Yang Zhang, Huadong Ma

Video Internet of Things (VIoT) has shown full potential in collecting an unprecedented volume of video data.

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

1 code implementation 22 Mar 2023 Wulian Yun, Mengshi Qi, Chuanming Wang, Huadong Ma

Weakly-supervised temporal action localization aims to locate action regions and identify action categories in untrimmed videos simultaneously by taking only video-level labels as the supervision.

Pseudo Label Weakly-supervised Temporal Action Localization +1

SGFormer: Semantic Graph Transformer for Point Cloud-based 3D Scene Graph Generation

1 code implementation 20 Mar 2023 Changsheng Lv, Mengshi Qi, Xia Li, Zhengyuan Yang, Huadong Ma

In this paper, we propose a novel model called SGFormer, Semantic Graph TransFormer for point cloud-based 3D scene graph generation.

3d scene graph generation Graph Embedding +3

Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks

1 code implementation ICCV 2023 Shuai He, Anlong Ming, Yaqi Li, Jinyuan Sun, Shuntian Zheng, Huadong Ma

We present a comprehensive study on a new task named image color aesthetics assessment (ICAA), which aims to assess color aesthetics based on human perception.

Image Quality Assessment

You Do Not Need Additional Priors or Regularizers in Retinex-Based Low-Light Image Enhancement

no code implementations CVPR 2023 Huiyuan Fu, Wenkai Zheng, Xiangyu Meng, Xin Wang, Chuanming Wang, Huadong Ma

The Retinex-based methods require decomposing the image into reflectance and illumination components, which is a highly ill-posed problem and there is no available ground truth.

Contrastive Learning Low-Light Image Enhancement +1

Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement

1 code implementation ICCV 2023 Huiyuan Fu, Wenkai Zheng, Xicong Wang, Jiaxuan Wang, Heng Zhang, Huadong Ma

To address this issue, we design a camera system and collect a high-quality low-light video dataset with multiple exposures and cameras.

Video Enhancement

Coarse-to-Fine Video Denoising with Dual-Stage Spatial-Channel Transformer

no code implementations 30 Apr 2022 Wulian Yun, Mengshi Qi, Chuanming Wang, Huiyuan Fu, Huadong Ma

Meanwhile, we design a Multi-Scale Residual Structure to preserve multiple aspects of information at different stages, which contains a Temporal Features Aggregation Module to summarize the dynamic representation.

Denoising Video Denoising

Learning to Help Emergency Vehicles Arrive Faster: A Cooperative Vehicle-Road Scheduling Approach

no code implementations 20 Feb 2022 Lige Ding, Dong Zhao, Zhaofeng Wang, Guang Wang, Chang Tan, Lei Fan, Huadong Ma

The ever-increasing heavy traffic congestion potentially impedes the accessibility of emergency vehicles (EVs), resulting in detrimental impacts on critical services and even safety of people's lives.

Graph Attention Scheduling +1

SPAP: Simultaneous Demand Prediction and Planning for Electric Vehicle Chargers in a New City

1 code implementation 18 Oct 2021 Yizong Wang, Dong Zhao, Yajie Ren, Desheng Zhang, Huadong Ma

A direct idea is to leverage the urban transfer learning paradigm to learn the knowledge from a source city, then exploit it to predict charging demands, and meanwhile determine locations and amounts of slow/fast chargers for charging stations in the target city.

Domain Adaptation Prediction +1

Language Guided Networks for Cross-modal Moment Retrieval

no code implementations 18 Jun 2020 Kun Liu, Huadong Ma, Chuang Gan

In this paper, we present Language Guided Networks (LGN), a new framework that leverages the sentence embedding to guide the whole process of moment retrieval.

Moment Retrieval Retrieval +3

A Real-time Action Representation with Temporal Encoding and Deep Compression

no code implementations 17 Jun 2020 Kun Liu, Wu Liu, Huadong Ma, Mingkui Tan, Chuang Gan

Our method achieves clear improvements on the UCF101 action recognition benchmark against state-of-the-art real-time methods by 5.4% in terms of accuracy and 2 times faster in terms of inference speed with a less than 5MB storage model.

Action Recognition

MemNet: Memory-Efficiency Guided Neural Architecture Search with Augment-Trim learning

no code implementations 22 Jul 2019 Peiye Liu, Bo Wu, Huadong Ma, Mingoo Seok

Recent studies on automatic neural architecture search have demonstrated significant performance, competitive with or even better than hand-crafted neural architectures.

Neural Architecture Search

PVSS: A Progressive Vehicle Search System for Video Surveillance Networks

no code implementations 10 Jan 2019 Xinchen Liu, Wu Liu, Huadong Ma, Shuangqun Li

In this paper, a Progressive Vehicle Search System, named as PVSS, is designed to solve the above problems.

Attribute Triplet

KTAN: Knowledge Transfer Adversarial Network

no code implementations 18 Oct 2018 Peiye Liu, Wu Liu, Huadong Ma, Tao Mei, Mingoo Seok

To transfer the knowledge of intermediate representations, we set high-level teacher feature maps as a target, toward which the student feature maps are trained.

Image Classification Knowledge Distillation +3

Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data

no code implementations 20 Oct 2017 Kun Liu, Wu Liu, Huadong Ma, Wenbing Huang, Xiongxiong Dong

Motivated by this, we study the task of action recognition in surveillance video under a more realistic generalized zero-shot setting, where testing data contains both seen and unseen classes.

Action Recognition Generalized Zero-Shot Learning +1
