Search Results for author: Yuhang Zhang

Found 45 papers, 18 papers with code

Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding

no code implementations 12 Jun 2025 Yuhang Zhang, Haosheng Yu, Jiaping Xiao, Mir Feroskhan

Moreover, real-world VLN tasks in indoor and outdoor environments under direct and indirect instructions demonstrate that VLFly achieves robust open-vocabulary goal understanding and generalized navigation capabilities, even in the presence of abstract language input.

Language Modeling Language Modelling +3

FM-Planner: Foundation Model Guided Path Planning for Autonomous Drone Navigation

1 code implementation 27 May 2025 Jiaping Xiao, Cheng Wen Tsao, Yuhang Zhang, Mir Feroskhan

Path planning is a critical component in autonomous drone operations, enabling safe and efficient navigation through complex environments.

Benchmarking Decision Making +1

Depth-Sensitive Soft Suppression with RGB-D Inter-Modal Stylization Flow for Domain Generalization Semantic Segmentation

no code implementations 11 May 2025 Binbin Wei, Yuhang Zhang, Shishun Tian, Muxin Liao, Wei Li, Wenbin Zou

Hence, we propose a novel framework, namely Depth-Sensitive Soft Suppression with RGB-D inter-modal stylization flow (DSSS), focusing on learning domain-invariant features from depth maps for the DG semantic segmentation.

Domain Generalization Semantic Segmentation +1

Difference-in-Differences Meets Synthetic Control: Doubly Robust Identification and Estimation

no code implementations 14 Mar 2025 Yixiao Sun, Haitian Xie, Yuhang Zhang

Our integrated approach provides a doubly robust identification strategy for causal effects in panel data with a group structure, identifying the average treatment effect on the treated (ATT) under either the parallel trends assumption or the group-level SC assumption.

Causal Inference

Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking

no code implementations 1 Feb 2025 Jie Ren, Yuhang Zhang, Dongrui Liu, Xiaopeng Zhang, Qi Tian

Direct preference optimization (DPO) has shown success in aligning diffusion models with human preference.

Evaluating Human Perception of Novel View Synthesis: Subjective Quality Assessment of Gaussian Splatting and NeRF in Dynamic Scenes

no code implementations 13 Jan 2025 Yuhang Zhang, Joshua Maraval, Zhengyu Zhang, Nicolas Ramin, Shishun Tian, Lu Zhang

To address these challenges, we conducted two subjective experiments for the quality assessment of NVS technologies containing both GS-based and NeRF-based methods, focusing on dynamic and real-world scenes.

NeRF Novel View Synthesis

Fleximo: Towards Flexible Text-to-Human Motion Video Generation

no code implementations 29 Nov 2024 Yuhang Zhang, Yuan Zhou, Zeyu Liu, Yuxuan Cai, Qiuyue Wang, Aidong Men, Huan Yang

Current methods for generating human motion videos rely on extracting pose sequences from reference videos, which restricts flexibility and control.

Image to Video Generation Large Language Model +1

Generalizable Facial Expression Recognition

1 code implementation 20 Aug 2024 Yuhang Zhang, Xiuqi Zheng, Chenyi Liang, Jiani Hu, Weihong Deng

To preserve the generalization ability of CLIP and the high precision of the FER model, we design a novel approach that learns sigmoid masks based on the fixed CLIP face features to extract expression features.

Domain Adaptation Facial Expression Recognition +2

DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion

no code implementations 17 Jul 2024 Huiguo He, Huan Yang, Zixi Tuo, Yuan Zhou, Qiuyue Wang, Yuhang Zhang, Zeyu Liu, Wenhao Huang, Hongyang Chao, Jian Yin

DreamStory consists of (1) an LLM acting as a story director and (2) an innovative Multi-Subject consistent Diffusion model (MSD) for generating consistent multi-subject across the images.

Descriptive Story Visualization

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

2 code implementations 9 May 2024 Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li

Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks sufficient implementation details.

Beyond Traditional Threats: A Persistent Backdoor Attack on Federated Learning

1 code implementation 26 Apr 2024 Tao Liu, Yuhang Zhang, Zhu Feng, Zhiqin Yang, Chen Xu, Dapeng Man, Wu Yang

The trained backdoored global model is more resilient to benign updates, leading to a higher attack success rate on the test set.

Backdoor Attack Federated Learning

Faceptor: A Generalist Model for Face Perception

1 code implementation 14 Mar 2024 Lixiong Qin, Mei Wang, Xuannan Liu, Yuhang Zhang, Wei Deng, Xiaoshuai Song, Weiran Xu, Weihong Deng

This design enhances the unification of model structure while improving application efficiency in terms of storage overhead.

Age Estimation Attribute +5

Open-Set Facial Expression Recognition

no code implementations 23 Jan 2024 Yuhang Zhang, Yue Yao, Xuannan Liu, Lixiong Qin, Wenjing Wang, Weihong Deng

Facial expression recognition (FER) models are typically trained on datasets with a fixed number of seven basic classes.

Facial Expression Recognition Facial Expression Recognition (FER) +1

DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query

no code implementations 28 Dec 2023 Yuhang Zhang, Yuang Deng, Xiaopeng Zhang, Jie Li, Robert C. Qiu, Qi Tian

In DeLR, the query is based on region-level, and we only annotate the object region that is queried; 2) Instead of directly providing both localization and recognition annotations, we separately query the two components, and thus reduce the recognition budget with the pseudo class labels provided by the model.

Active Learning Object +2

AdvCloak: Customized Adversarial Cloak for Privacy Protection

no code implementations 22 Dec 2023 Xuannan Liu, Yaoyao Zhong, Xing Cui, Yuhang Zhang, Peipei Li, Weihong Deng

This strategy initially focuses on adapting the masks to the unique individual faces via image-specific training and then enhances their feature-level generalization ability to diverse facial variations of individuals via person-specific training.

Vision-based Learning for Drones: A Survey

no code implementations 8 Dec 2023 Jiaping Xiao, Rangya Zhang, Yuhang Zhang, Mir Feroskhan

Drones as advanced cyber-physical systems are undergoing a transformative shift with the advent of vision-based learning, a field that is rapidly gaining prominence due to its profound impact on drone autonomy and functionality.

Decision Making Survey

MARVEL: Multi-Agent Reinforcement-Learning for Large-Scale Variable Speed Limits

no code implementations 18 Oct 2023 Yuhang Zhang, Marcos Quinones-Grueiro, Zhiyao Zhang, Yanbing Wang, William Barbour, Gautam Biswas, Daniel Work

Variable Speed Limit (VSL) control acts as a promising highway traffic management strategy with worldwide deployment, which can enhance traffic safety by dynamically adjusting speed limits according to real-time traffic conditions.

Decision Making Management +2

Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images

no code implementations 12 Oct 2023 Zhao Ning Zou, Yuhang Zhang, Robert Wijaya

We studied this issue by measuring the performance of DETR with different experiments and benchmarking the network with convolutional neural network (CNN) based detectors like YOLO and Faster-RCNN.

Benchmarking Decoder +2

Constructing Synthetic Treatment Groups without the Mean Exchangeability Assumption

no code implementations 28 Sep 2023 Yuhang Zhang, Yue Liu, Zhihua Zhang

Motivated by the synthetic control method, we construct a synthetic treatment group for the target population by a weighted mixture of treatment groups of source populations.

Calibration-based Dual Prototypical Contrastive Learning Approach for Domain Generalization Semantic Segmentation

1 code implementation 25 Sep 2023 Muxin Liao, Shishun Tian, Yuhang Zhang, Guoguang Hua, Wenbin Zou, Xia Li

Based on these observations, a calibration-based dual prototypical contrastive learning (CDPCL) approach is proposed to reduce the domain discrepancy between the learned class-wise features and the prototypes of different domains for domain generalization semantic segmentation.

Contrastive Learning Domain Generalization +1

Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation

1 code implementation ICCV 2023 Xuannan Liu, Yaoyao Zhong, Yuhang Zhang, Lixiong Qin, Weihong Deng

Deep neural networks are vulnerable to universal adversarial perturbation (UAP), an instance-agnostic perturbation capable of fooling the target model for most samples.

Quantization

CAV Traffic Control to Mitigate the Impact of Congestion from Bottlenecks: A Linear Quadratic Regulator Approach and Microsimulation Study

no code implementations 17 Jun 2023 Suyash C. Vishnoi, Junyi Ji, MirSaleh Bahavarnia, Yuhang Zhang, Ahmad F. Taha, Christian G. Claudel, Daniel B. Work

The effectiveness of the proposed traffic control algorithms is tested using a traffic control example and compared with existing proportional-integral (PI)- and model predictive control (MPC)- based controllers from the literature.

Model Predictive Control

Detecting Socially Abnormal Highway Driving Behaviors via Recurrent Graph Attention Networks

1 code implementation 23 Apr 2023 Yue Hu, Yuhang Zhang, Yanbing Wang, Daniel Work

In this work, we consider the problem of detecting a variety of socially abnormal driving behaviors, i.e., behaviors that do not conform to the behavior of other nearby drivers.

Anomaly Detection Graph Attention

Model and Data Agreement for Learning with Noisy Labels

1 code implementation 2 Dec 2022 Yuhang Zhang, Weihong Deng, Xingchen Cui, Yunfeng Yin, Hongzhi Shi, Dongchao Wen

We introduce mean point ensemble to utilize a more robust loss function and more information from unselected samples to reduce error accumulation from the model perspective.

Learning with noisy labels

Learn From All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition

1 code implementation 21 Jul 2022 Yuhang Zhang, Chengrui Wang, Xu Ling, Weihong Deng

We find that FER models remember noisy samples by focusing on a part of the features that can be considered related to the noisy labels instead of learning from the whole features that lead to the latent truth.

All Facial Expression Recognition +2

Observer-Based Coordinated Tracking Control for Nonlinear Multi-Agent Systems with Intermittent Communication under Heterogeneous Coupling Framework

no code implementations 29 Jun 2022 Yuhang Zhang, Yulian Jiang, Shenquan Wang

In this article, the observer-based coordinated tracking control problem for a class of nonlinear multi-agent systems (MASs) with intermittent communication and information constraints is studied under dynamic switching topology.

LEMMA

Learning Efficient Representations for Enhanced Object Detection on Large-scene SAR Images

no code implementations 22 Jan 2022 Siyan Li, Yue Xiao, Yuhang Zhang, Lei Chu, Robert C. Qiu

It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.

Diversity object-detection +1

One-Bit Active Query With Contrastive Pairs

no code implementations CVPR 2022 Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian

The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.

Active Learning Contrastive Learning

Relative Uncertainty Learning for Facial Expression Recognition

1 code implementation NeurIPS 2021 Yuhang Zhang, Chengrui Wang, Weihong Deng

To quantify these uncertainties and achieve good performance under noisy data, we regard uncertainty as a relative concept and propose an innovative uncertainty learning method called Relative Uncertainty Learning (RUL).

Facial Expression Recognition Facial Expression Recognition (FER)

Steadily Learn to Drive with Virtual Memory

no code implementations 16 Feb 2021 Yuhang Zhang, Yao Mu, Yujie Yang, Yang Guan, Shengbo Eben Li, Qi Sun, Jianyu Chen

Reinforcement learning has shown great potential in developing high-level autonomous driving.

Autonomous Driving

DUAL ADVERSARIAL MODEL FOR GENERATING 3D POINT CLOUD

no code implementations 25 Sep 2019 Yuhang Zhang, Zhenwei Miao, Tiebin Mi, Robert Caiming Qiu

Three-dimensional data, such as point clouds, are often composed of three coordinates with few features.

Generative Adversarial Network model +1
