Search Results for author: Yonghao Dang

Found 21 papers, 9 papers with code

3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving

no code implementations • 14 Jul 2025 • Yixun Zhang, Lizhi Wang, Junjun Zhao, Wending Zhao, Feng Zhou, Yonghao Dang, Jianqin Yin

In this work, we propose 3D Gaussian-based Adversarial Attack (3DGAA), a novel adversarial object generation framework that leverages the full 14-dimensional parameterization of 3D Gaussian Splatting (3DGS) to jointly optimize geometry and appearance in physically realizable ways.

3DGS Adversarial Attack +1

L2HCount: Generalizing Crowd Counting from Low to High Crowd Density via Density Simulation

no code implementations • 17 Mar 2025 • Guoliang Xu, Jianqin Yin, Ren Zhang, Yonghao Dang, Feng Zhou, Bo Yu

Third, we propose a Dual-Density Memory Encoding Module that uses two crowd memories to learn scene-specific patterns from low- and simulated high-density scenes, respectively.

Crowd Counting

GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting

no code implementations • 7 Mar 2025 • Zheng Zhou, Zhe Li, Bo Yu, Lina Hu, Liang Dong, Zijian Yang, Xiaoli Liu, Ning Xu, Ziwei Wang, Yonghao Dang, Jianqin Yin

While this reformulation offers a promising perspective, existing 3D reconstruction methods typically require natural images and corresponding camera poses as inputs, which introduces two major challenges: (1) the modality discrepancy between CAD sketches and natural images, and (2) the difficulty of accurate camera pose estimation for CAD sketches.

3D Reconstruction CAD Reconstruction +2

QUART-Online: Latency-Free Large Multimodal Language Model for Quadruped Robot Learning

no code implementations • 20 Dec 2024 • Xinyang Tong, Pengxiang Ding, Yiguo Fan, Donglin Wang, Wenjie Zhang, Can Cui, Mingyang Sun, Han Zhao, Hongyin Zhang, Yonghao Dang, Siteng Huang, Shangke Lyu

This paper addresses the inherent inference latency challenges associated with deploying multimodal large language models (MLLM) in quadruped vision-language-action (QUAR-VLA) tasks.

Language Modeling Language Modelling +1

MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection

no code implementations • 2 Dec 2024 • Yonghao Dang, Liyuan Liu, Hui Kang, Ping Ye, Jianqin Yin

Moreover, MamKPD achieves state-of-the-art results on the MPII dataset and competitive results on the AP-10K dataset while saving 85% of the parameters compared to ViTPose.

Animal Pose Estimation GPU +2

Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation

2 code implementations • 16 Sep 2024 • Meng Chen, Jiawei Tu, Chao Qi, Yonghao Dang, Feng Zhou, Wei Wei, Jianqin Yin

To make the patch inconspicuous to human observers, we introduce a two-stage opacity optimization mechanism, in which opacity is fine-tuned after texture optimization.

Adversarial Robustness object-detection +1

ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality

no code implementations • 29 Jul 2024 • Guoliang Xu, Jianqin Yin, Feng Zhou, Yonghao Dang

Thus, we propose ActivityCLIP, a plug-and-play method for mining the text information contained in the action labels to supplement the image information for enhancing group activity recognition.

Group Activity Recognition Knowledge Distillation +2

Micro-expression recognition based on depth map to point cloud

no code implementations • 12 Jun 2024 • Ren Zhang, Jianqin Yin, Chao Qi, Zehao Wang, Zhicheng Zhang, Yonghao Dang

Conversely, depth information can effectively represent motion information related to facial structure changes and is not affected by lighting.

Micro Expression Recognition Micro-Expression Recognition

Spatial-Temporal Decoupling Contrastive Learning for Skeleton-based Human Action Recognition

1 code implementation • 23 Dec 2023 • Shaojie Zhang, Jianqin Yin, Yonghao Dang

Furthermore, to explicitly exploit the latent data distributions, we apply contrastive learning to the attentive features, which models cross-sequence semantic relations by pulling together the features of positive pairs and pushing apart those of negative pairs.

Action Recognition Contrastive Learning +2
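
The pull-together and push-apart objective mentioned in the snippet above can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch: the listing does not give the paper's actual loss, so the function names (`cosine`, `info_nce`), the use of cosine similarity, and the temperature value are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (assumed form, not taken from the paper).

    The loss is low when the anchor is similar to its positive and
    dissimilar to every negative, and high in the opposite case.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical 2-D features: a well-aligned positive versus a misaligned one.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setup `aligned` is close to zero while `misaligned` is large, which is the behavior a contrastive objective is meant to reward and penalize respectively.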

BiHRNet: A Binary high-resolution network for Human Pose Estimation

no code implementations • 17 Nov 2023 • Zhicheng Zhang, Xueyao Sun, Yonghao Dang, Jianqin Yin

On the challenging COCO dataset, the proposed method enables the binary neural network to achieve 70.8 mAP, which is better than most tested lightweight full-precision networks.

Binarization Pose Estimation

Physics-constrained Attack against Convolution-based Human Motion Prediction

1 code implementation • 21 Jun 2023 • Chengxu Duan, Zhicheng Zhang, Xiaoli Liu, Yonghao Dang, Jianqin Yin

Specifically, we introduce a novel adaptable scheme that facilitates the attack to suit the scale of the target pose and two physical constraints to enhance the naturalness of the adversarial example.

Adversarial Attack Adversarial Robustness +3

An Improved Baseline Framework for Pose Estimation Challenge at ECCV 2022 Visual Perception for Navigation in Human Environments Workshop

no code implementations • 13 Mar 2023 • Jiajun Fu, Yonghao Dang, Ruoqi Yin, Shaojie Zhang, Feng Zhou, Wending Zhao, Jianqin Yin

This technical report describes our first-place solution to the pose estimation challenge at ECCV 2022 Visual Perception for Navigation in Human Environments Workshop.

Human Detection Pose Estimation

Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization

1 code implementation • 11 Oct 2022 • Yuanyuan Jiang, Jianqin Yin, Yonghao Dang

In contrast to existing methods, we propose a novel video-level semantic consistency guidance network for the AVE localization task.

audio-visual event localization

Kinematics Modeling Network for Video-based Human Pose Estimation

no code implementations • 22 Jul 2022 • Yonghao Dang, Jianqin Yin, Shaojie Zhang, Jiping Liu, Yanzhu Hu

In this work, we propose a plug-and-play kinematics modeling module (KMM) to explicitly model temporal correlations between joints across different frames by calculating their temporal similarity.

Optical Flow Estimation Pose Estimation

Learning Constrained Dynamic Correlations in Spatiotemporal Graphs for Motion Prediction

1 code implementation • 4 Apr 2022 • Jiajun Fu, Fuxing Yang, Yonghao Dang, Xiaoli Liu, Jianqin Yin

The key to DSTD-GC is constrained dynamic correlation modeling, which explicitly parameterizes the common static constraints as a vanilla spatial/temporal adjacency matrix shared by all frames/joints and dynamically extracts correspondence variances for each frame/joint with an adjustment modeling function.

Human motion prediction motion prediction
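
The idea of one shared static adjacency matrix plus a per-frame adjustment, as described in the snippet above, might be sketched roughly as follows. The snippet does not specify the adjustment modeling function, so the outer-product form used here, the `scale` parameter, and the names `adjust` and `per_frame_adjacency` are hypothetical stand-ins rather than the paper's method.

```python
def adjust(static_adj, frame_features, scale=0.1):
    """Hypothetical adjustment: shared matrix plus a scaled feature outer product.

    static_adj     -- n x n list of lists, shared by every frame
    frame_features -- one scalar feature per joint for a single frame
    """
    n = len(static_adj)
    return [
        [static_adj[i][j] + scale * frame_features[i] * frame_features[j]
         for j in range(n)]
        for i in range(n)
    ]

def per_frame_adjacency(static_adj, frames, scale=0.1):
    """Return one adjusted adjacency matrix per frame; static_adj is not mutated."""
    return [adjust(static_adj, features, scale) for features in frames]

# Toy example with two joints and two frames (illustrative values only).
static = [[1.0, 0.0], [0.0, 1.0]]
frames = [[0.0, 0.0], [1.0, 2.0]]
adjacency = per_frame_adjacency(static, frames)
```

With zero features the first frame keeps the shared matrix unchanged, while the second frame receives its own correction, so the static constraint and the frame-specific variation stay separately parameterized.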

Relation-Based Associative Joint Location for Human Pose Estimation in Videos

1 code implementation • 8 Jul 2021 • Yonghao Dang, Jianqin Yin, Shaojie Zhang

Moreover, the JRE can infer invisible joints according to the relationship between joints, which is beneficial for the model to locate occluded joints.

Pose Estimation Relation

Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos

no code implementations • 15 Mar 2020 • Jianqin Yin, Yanchun Wu, Huaping Liu, Yonghao Dang, Zhiyi Liu, Jun Liu

Our contribution is two-fold: 1) We present an important insight that deep features extracted for action recognition can model the self-similarity periodicity of repetitive actions well.

Action Recognition

DWnet: Deep-Wide Network for 3D Action Recognition

no code implementations • 29 Aug 2019 • Yonghao Dang, Fuxing Yang, Jianqin Yin

In this paper, we propose a deep-wide network (DWnet) that combines a deep structure with the broad learning system (BLS) to recognize actions.

3D Action Recognition
