1 code implementation • 29 Apr 2025 • Ziqing Fan, Cheng Liang, Chaoyi Wu, Ya Zhang, Yanfeng Wang, Weidi Xie
Recent advances in reasoning-enhanced large language models (LLMs) and multimodal LLMs (MLLMs) have significantly improved performance in complex tasks, yet medical AI models often overlook the structured reasoning processes inherent in clinical practice.
1 code implementation • 18 Nov 2024 • Shengchao Hu, Yuhang Zhou, Ziqing Fan, Jifeng Hu, Li Shen, Ya Zhang, Dacheng Tao
Training a generalizable agent to continually learn a sequence of tasks from offline trajectories is a natural requirement for long-lived agents, yet remains a significant challenge for current offline reinforcement learning (RL) algorithms.
1 code implementation • 2 Nov 2024 • Ziqing Fan, Shengchao Hu, Yuhang Zhou, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction.
no code implementations • 18 Jul 2024 • Pingjie Wang, Ziqing Fan, Shengchao Hu, Zhe Chen, Yanfeng Wang, Yu Wang
Structured pruning is a promising hardware-friendly compression technique for large language models (LLMs), which is expected to be retraining-free to avoid the enormous retraining cost.
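To make the idea of structured pruning concrete, here is a minimal sketch (not this paper's method) that drops whole output rows of a linear layer by L2 norm; the function name and keep_ratio parameter are illustrative only.

```python
import torch
import torch.nn as nn

def prune_linear_rows(layer: nn.Linear, keep_ratio: float) -> nn.Linear:
    """Keep only the highest-L2-norm output rows of a linear layer."""
    norms = layer.weight.norm(dim=1)                 # one score per output unit
    k = max(1, int(keep_ratio * layer.out_features))
    keep = norms.topk(k).indices.sort().values       # indices of rows to retain
    pruned = nn.Linear(layer.in_features, k, bias=layer.bias is not None)
    pruned.weight.data = layer.weight.data[keep].clone()
    if layer.bias is not None:
        pruned.bias.data = layer.bias.data[keep].clone()
    return pruned

layer = nn.Linear(512, 1024)
smaller = prune_linear_rows(layer, keep_ratio=0.5)   # 1024 -> 512 output units
print(smaller)
```

Because entire rows are removed, the resulting layer is genuinely smaller and faster on commodity hardware, unlike unstructured (element-wise) sparsity.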
1 code implementation • NeurIPS 2023 • Ziqing Fan, Ruipeng Zhang, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang
Partially class-disjoint data (PCDD), a common yet under-explored data formation where each client contributes a part of classes (instead of all classes) of samples, severely challenges the performance of federated algorithms.
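For intuition, a PCDD split can be simulated by giving each client a random subset of the label space; the toy partition below (a hypothetical helper, not the paper's benchmark protocol) makes the formation easy to reproduce.

```python
import random

def pcdd_partition(num_classes: int, num_clients: int,
                   classes_per_client: int, seed: int = 0) -> dict:
    """Assign each client a random subset of classes, so label sets
    overlap only partially across clients (partially class-disjoint data)."""
    rng = random.Random(seed)
    return {
        client: sorted(rng.sample(range(num_classes), classes_per_client))
        for client in range(num_clients)
    }

# 10 classes, 5 clients, 3 classes each: every client sees an incomplete label set
print(pcdd_partition(num_classes=10, num_clients=5, classes_per_client=3))
```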
1 code implementation • 29 May 2024 • Ruipeng Zhang, Ziqing Fan, Jiangchao Yao, Ya Zhang, Yanfeng Wang
This paper presents a Domain-Inspired Sharpness-Aware Minimization (DISAM) algorithm for optimization under domain shifts.
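As background, DISAM builds on sharpness-aware minimization (SAM), which perturbs the weights toward higher loss before taking the descent gradient. The sketch below shows only a vanilla SAM step; the domain-inspired weighting that DISAM adds on top is omitted.

```python
import torch

def sam_step(model, loss_fn, batch, optimizer, rho: float = 0.05):
    """One vanilla SAM update: ascend to a nearby high-loss point,
    take the gradient there, then descend from the original weights."""
    x, y = batch
    loss_fn(model(x), y).backward()                  # gradients at current weights
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)   # ascent direction
            p.add_(e)
            eps.append(e)
    model.zero_grad()
    loss_fn(model(x), y).backward()                  # gradient at the perturbed point
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)                            # restore original weights
    optimizer.step()                                 # sharpness-aware descent
    optimizer.zero_grad()
```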
1 code implementation • 29 May 2024 • Ziqing Fan, Jiangchao Yao, Ruipeng Zhang, Lingjuan Lyu, Ya Zhang, Yanfeng Wang
Statistical heterogeneity severely limits the performance of federated learning (FL), motivating several explorations, e.g., FedProx, MOON, and FedDyn, to alleviate this problem.
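Of the methods named above, FedProx is the simplest to write down: each client adds a proximal penalty keeping its weights near the current global model. A minimal sketch of that client objective (mu is the usual proximal coefficient; the helper name is ours):

```python
import torch

def fedprox_loss(local_model: torch.nn.Module, global_params: list,
                 task_loss: torch.Tensor, mu: float = 0.01) -> torch.Tensor:
    """FedProx client objective: task loss + (mu/2) * ||w_local - w_global||^2."""
    prox = sum((p - g.detach()).pow(2).sum()
               for p, g in zip(local_model.parameters(), global_params))
    return task_loss + 0.5 * mu * prox
```

The penalty discourages client drift during local epochs, which is exactly the failure mode that statistical heterogeneity induces.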
1 code implementation • 29 May 2024 • Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya Zhang, Masashi Sugiyama, Yanfeng Wang
However, the local loss landscapes may not accurately reflect the flatness of the global loss landscape in heterogeneous environments; as a result, minimizing local sharpness and calculating perturbations on client data might not bring the efficacy of SAM in FL in line with that of centralized training.
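To spell the mismatch out in standard SAM/FL notation (the per-client losses $f_k$ and averaging weights $p_k$ are the usual federated objective terms, not notation from the paper): each client can only minimize the sharpness of its own loss, yet

$$\sum_k p_k \max_{\|\epsilon_k\|\le\rho} f_k(w+\epsilon_k) \;\ge\; \max_{\|\epsilon\|\le\rho} \sum_k p_k f_k(w+\epsilon),$$

so per-client sharpness minimization attacks only an upper bound on global sharpness, and the slack between the two sides grows as client distributions diverge.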
1 code implementation • 28 May 2024 • Shengchao Hu, Ziqing Fan, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
However, variations in task content and complexity pose significant challenges in policy formulation, necessitating judicious parameter sharing and management of conflicting gradients for optimal policy performance.
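One common way to manage such conflicting gradients is PCGrad-style surgery, projecting one task's gradient off another's when they point in opposing directions; the sketch below illustrates that generic technique, not necessarily this paper's mechanism.

```python
import torch

def resolve_conflict(g_i: torch.Tensor, g_j: torch.Tensor) -> torch.Tensor:
    """If two task gradients conflict (negative dot product), remove from g_i
    its component along g_j (PCGrad-style projection)."""
    dot = torch.dot(g_i, g_j)
    if dot < 0:
        g_i = g_i - (dot / (g_j.norm() ** 2 + 1e-12)) * g_j
    return g_i

g1 = torch.tensor([1.0, 0.0])
g2 = torch.tensor([-1.0, 1.0])      # conflicts with g1
print(resolve_conflict(g1, g2))     # tensor([0.5000, 0.5000])
```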
2 code implementations • 27 May 2024 • Shengchao Hu, Ziqing Fan, Chaoqin Huang, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action distribution conditioned on the history trajectory and target returns for each state.
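In CSM methods such as Decision Transformer, the policy is trained to predict each action conditioned on the trajectory so far and a target return-to-go. The toy module below shows only that conditioning interface, with a single-state MLP standing in for a real sequence model; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    """Toy return-conditioned policy: action from (state, return-to-go)."""

    def __init__(self, state_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden),  # +1 for the scalar return-to-go
            nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, states: torch.Tensor, returns_to_go: torch.Tensor) -> torch.Tensor:
        # states: (batch, state_dim); returns_to_go: (batch, 1)
        return self.net(torch.cat([states, returns_to_go], dim=-1))

policy = ReturnConditionedPolicy(state_dim=17, act_dim=6)
actions = policy(torch.randn(4, 17), torch.full((4, 1), 300.0))  # request return 300
print(actions.shape)  # torch.Size([4, 6])
```

At evaluation time, conditioning on a higher target return steers the same network toward higher-performing behavior, which is the core appeal of the paradigm.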
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian
However, beyond prior efforts to improve federated averaging, our analysis shows that another critical bottleneck is the poorer optima reached by client models under more heterogeneous conditions.
no code implementations • 9 Aug 2021 • Rui Cao, Ziqing Fan, Roy Ka-Wei Lee, Wen-Haw Chong, Jing Jiang
Our experimental results show that DisMultiHate outperforms state-of-the-art unimodal and multimodal baselines on the hateful meme classification task.
Ranked #7 on Hateful Meme Classification on HarMeme