Search Results for author: YuHang Zhou

Found 39 papers, 15 papers with code

MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs

no code implementations 3 Feb 2025 YuHang Zhou, Giannis Karamanolakis, Victor Soto, Anna Rumshisky, Mayank Kulkarni, Furong Huang, Wei Ai, Jianhua Lu

The recent success of specialized Large Language Models (LLMs) in domains such as mathematical reasoning and coding has led to growing interest in methods for merging these expert LLMs into a unified Mixture-of-Experts (MoE) model, with the goal of enhancing performance in each domain while retaining effectiveness on general tasks.

Continual Task Learning through Adaptive Policy Self-Composition

1 code implementation 18 Nov 2024 Shengchao Hu, YuHang Zhou, Ziqing Fan, Jifeng Hu, Li Shen, Ya Zhang, DaCheng Tao

Training a generalizable agent to continually learn a sequence of tasks from offline trajectories is a natural requirement for long-lived agents, yet remains a significant challenge for current offline reinforcement learning (RL) algorithms.

Continual Learning Offline RL +1

Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

1 code implementation 2 Nov 2024 Ziqing Fan, Shengchao Hu, YuHang Zhou, Li Shen, Ya Zhang, Yanfeng Wang, DaCheng Tao

The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction.

Meta-Learning

Revisiting SLO and Goodput Metrics in LLM Serving

no code implementations 18 Oct 2024 Zhibin Wang, Shipeng Li, YuHang Zhou, Xue Li, Rong Gu, Nguyen Cam-Tu, Chen Tian, Sheng Zhong

In this paper, we revisit SLO and goodput metrics in LLM serving and propose a unified metric framework, smooth goodput, which incorporates SLOs and goodput to reflect the nature of user experience in LLM serving.

LoRKD: Low-Rank Knowledge Decomposition for Medical Foundation Models

2 code implementations 29 Sep 2024 Haolin Li, YuHang Zhou, Ziheng Zhao, Siyuan Du, Jiangchao Yao, Weidi Xie, Ya Zhang, Yanfeng Wang

To accomplish the above objective, we propose a novel framework named Low-Rank Knowledge Decomposition (LoRKD), which explicitly separates gradients from different tasks by incorporating low-rank expert modules and efficient knowledge separation convolution.

3D Medical Imaging Segmentation Medical Image Classification

CSRec: Rethinking Sequential Recommendation from A Causal Perspective

1 code implementation 23 Aug 2024 Xiaoyu Liu, Jiaxin Yuan, YuHang Zhou, Jingling Li, Furong Huang, Wei Ai

The essence of sequential recommender systems (RecSys) lies in understanding how users make decisions.

Sequential Recommendation

Reprogramming Distillation for Medical Foundation Models

no code implementations 9 Jul 2024 YuHang Zhou, Siyuan Du, Haolin Li, Jiangchao Yao, Ya Zhang, Yanfeng Wang

However, due to the gap between pre-training tasks (or modalities) and downstream tasks (or modalities), as well as real-world computation and speed constraints, it might not be straightforward to apply medical foundation models directly in downstream scenarios.

Knowledge Distillation parameter-efficient fine-tuning +1

Multimodal Graph Benchmark

1 code implementation 24 Jun 2024 Jing Zhu, YuHang Zhou, Shengyi Qian, Zhongmou He, Tong Zhao, Neil Shah, Danai Koutra

Associating unstructured data with structured information is crucial for real-world tasks that require relevance search.

Graph Learning

Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

1 code implementation 19 Jun 2024 YuHang Zhou, Jing Zhu, Paiheng Xu, Xiaoyu Liu, Xiyao Wang, Danai Koutra, Wei Ai, Furong Huang

Large language models (LLMs) have significantly advanced various natural language processing tasks, but deploying them remains computationally expensive.

Knowledge Distillation

Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

1 code implementation 14 Jun 2024 YuHang Zhou, Zihua Zhao, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

Training a unified model to take multiple targets into account is a trend towards artificial general intelligence.

Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios

no code implementations 8 Jun 2024 YuHang Zhou, Wei Ai

The first signal is the student's self-consistency (the consistency of the student's multiple outputs), which serves as a proxy for the student's confidence.

Knowledge Distillation

Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement

2 code implementations 24 May 2024 Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, YuHang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Furong Huang, Cao Xiao

In this paper, we propose SIMA, a framework that enhances visual and language modality alignment through self-improvement, eliminating the need for external models or data.

Hallucination Image Comprehension +2

Low-Rank Knowledge Decomposition for Medical Foundation Models

1 code implementation CVPR 2024 YuHang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

The popularity of large-scale pre-training has promoted the development of medical foundation models.

Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers

no code implementations 7 Apr 2024 Libo Qin, Qiguang Chen, YuHang Zhou, Zhi Chen, Yinghui Li, Lizi Liao, Min Li, Wanxiang Che, Philip S. Yu

To this end, in this paper, we present a thorough review and provide a unified perspective to summarize the recent progress as well as emerging trends in multilingual large language models (MLLMs) literature.

Language Modeling Language Modelling +2

SilverSight: A Multi-Task Chinese Financial Large Language Model Based on Adaptive Semantic Space Learning

no code implementations 7 Apr 2024 YuHang Zhou, Zeping Li, Siyu Tian, Yuchen Ni, Sen Liu, Guangnan Ye, Hongfeng Chai

Large language models (LLMs) are increasingly being applied across various specialized fields, leveraging their extensive knowledge to empower a multitude of scenarios within these domains.

Language Modeling Language Modelling +1

Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks

no code implementations 4 Apr 2024 Lei Zhang, YuHang Zhou, Yi Yang, Xinbo Gao

Despite providing high-performance solutions for computer vision tasks, deep neural network (DNN) models have been proven to be extremely vulnerable to adversarial attacks.

Adversarial Defense Adversarial Robustness +1

Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay

no code implementations CVPR 2024 YuHang Zhou, Zhongyun Hua

In this paper, we discuss for the first time the concept of continual adversarial defense under a sequence of attacks, and propose a lifelong defense baseline called Anisotropic & Isotropic Replay (AIR), which offers three advantages: (1) Isotropic replay ensures model consistency in the neighborhood distribution of new data, indirectly aligning the output preference between old and new tasks.

Adversarial Defense

Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

no code implementations 14 Mar 2024 Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, YuHang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu, Julian McAuley, Wei Ai, Furong Huang

Causal inference has shown potential in enhancing the predictive accuracy, fairness, robustness, and explainability of Natural Language Processing (NLP) models by capturing causal relationships among variables.

Causal Inference Fairness

From Adoption to Adaption: Tracing the Diffusion of New Emojis on Twitter

no code implementations 22 Feb 2024 YuHang Zhou, Xuan Lu, Wei Ai

In the rapidly evolving landscape of social media, the introduction of new emojis in Unicode release versions presents a structured opportunity to explore digital language evolution.

Sentiment Analysis Sentiment Classification

Invariance-powered Trustworthy Defense via Remove Then Restore

no code implementations 1 Feb 2024 Xiaowei Fu, YuHang Zhou, Lina Ma, Lei Zhang

Based on this finding, a Pixel Surgery and Semantic Regeneration (PSSR) model following the targeted-therapy mechanism is developed, which has three merits: 1) to remove the salient attack, a score-based Pixel Surgery module is proposed, which retains the trivial attack as a form of invariance information.

Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications

no code implementations 22 Jan 2024 YuHang Zhou, Paiheng Xu, Xiyao Wang, Xuan Lu, Ge Gao, Wei Ai

Our objective is to validate the hypothesis that ChatGPT can serve as a viable alternative to human annotators in emoji research and that its ability to explain emoji meanings can enhance clarity and transparency in online communications.

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

1 code implementation 19 Jan 2024 Xiyao Wang, YuHang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang

However, current MLLM benchmarks are predominantly designed to evaluate reasoning based on static information about a single image, and the ability of modern MLLMs to extrapolate from image sequences, which is essential for understanding our ever-changing world, has been less investigated.

Language Modeling Language Modelling +2

Explore Spurious Correlations at the Concept Level in Language Models for Text Classification

1 code implementation 15 Nov 2023 YuHang Zhou, Paiheng Xu, Xiaoyu Liu, Bang An, Wei Ai, Furong Huang

We find that LMs, when encountering spurious correlations between a concept and a label in training or prompts, resort to shortcuts for predictions.

counterfactual In-Context Learning +2

$R^3$-NL2GQL: A Model Coordination and Knowledge Graph Alignment Approach for NL2GQL

1 code implementation 3 Nov 2023 YuHang Zhou, Yu He, Siyu Tian, Yuchen Ni, Zhangyue Yin, Xiang Liu, Chuanjun Ji, Sen Liu, Xipeng Qiu, Guangnan Ye, Hongfeng Chai

While current tasks of converting natural language to SQL (NL2SQL) using Foundation Models have shown impressive achievements, adapting these approaches for converting natural language to Graph Query Language (NL2GQL) encounters hurdles due to the distinct nature of GQL compared to SQL, alongside the diverse forms of GQL.

Knowledge Graphs Natural Language Queries +3

Emoji Promotes Developer Participation and Issue Resolution on GitHub

no code implementations 30 Aug 2023 YuHang Zhou, Xuan Lu, Ge Gao, Qiaozhu Mei, Wei Ai

In this paper, we study how emoji usage influences developer participation and issue resolution in virtual workspaces.

Causal Inference

Balanced Destruction-Reconstruction Dynamics for Memory-replay Class Incremental Learning

1 code implementation 3 Aug 2023 YuHang Zhou, Jiangchao Yao, Feng Hong, Ya Zhang, Yanfeng Wang

By dynamically manipulating the gradient during training based on these factors, BDR can effectively alleviate knowledge destruction and improve knowledge reconstruction.

class-incremental learning Class Incremental Learning +1

Pitfalls in Link Prediction with Graph Neural Networks: Understanding the Impact of Target-link Inclusion & Better Practices

no code implementations 1 Jun 2023 Jing Zhu, YuHang Zhou, Vassilis N. Ioannidis, Shengyi Qian, Wei Ai, Xiang Song, Danai Koutra

While Graph Neural Networks (GNNs) are remarkably successful in a variety of high-impact applications, we demonstrate that, in link prediction, the common practice of including the edges being predicted in the graph at training and/or test time has an outsized impact on the performance of low-degree nodes.

Link Prediction Node Classification

GFairHint: Improving Individual Fairness for Graph Neural Networks via Fairness Hint

no code implementations 25 May 2023 Paiheng Xu, YuHang Zhou, Bang An, Wei Ai, Furong Huang

Given the growing concerns about fairness in machine learning and the impressive performance of Graph Neural Networks (GNNs) on graph data learning, algorithmic fairness in GNNs has attracted significant attention.

Fairness Link Prediction

Scalable Prompt Generation for Semi-supervised Learning with Language Models

no code implementations 18 Feb 2023 YuHang Zhou, Suraj Maharjan, Beiye Liu

In this paper, we propose two methods to automatically design multiple prompts and integrate an automatic verbalizer into SSL settings without sacrificing performance.

Few-Shot Learning Natural Language Understanding

Swin MAE: Masked Autoencoders for Small Datasets

1 code implementation 28 Dec 2022 Zi'an Xu, Yin Dai, Fayu Liu, Weibing Chen, Yue Liu, Lifu Shi, Sheng Liu, YuHang Zhou

The development of deep learning models in medical image analysis is largely limited by the lack of large-scale, well-annotated datasets.

Medical Image Analysis Transfer Learning

AutoMine: An Unmanned Mine Dataset

no code implementations CVPR 2022 Yuchen Li, Zixuan Li, Siyu Teng, Yu Zhang, YuHang Zhou, Yuchang Zhu, Dongpu Cao, Bin Tian, Yunfeng Ai, Zhe XuanYuan, Long Chen

The main contributions of the AutoMine dataset are as follows: 1. The first autonomous driving dataset for perception and localization in mine scenarios.

Autonomous Driving

A General Traffic Shaping Protocol in E-Commerce

no code implementations 30 Dec 2021 Chenlin Shen, Guangda Huzhang, YuHang Zhou, Chen Liang, Qing Da

Our algorithm can straightforwardly optimize the linear program in the primal space, and its solution can be simply applied through a stochastic strategy to fulfill the optimized objective and the constraints in expectation.

MS-KD: Multi-Organ Segmentation with Multiple Binary-Labeled Datasets

no code implementations 5 Aug 2021 Shixiang Feng, YuHang Zhou, Xiaoman Zhang, Ya Zhang, Yanfeng Wang

A novel Multi-teacher Single-student Knowledge Distillation (MS-KD) framework is proposed, where the teacher models are pre-trained single-organ segmentation networks, and the student model is a multi-organ segmentation network.

Knowledge Distillation Organ Segmentation +1

On the Robustness of Domain Adaption to Adversarial Attacks

no code implementations 4 Aug 2021 Liyuan Zhang, YuHang Zhou, Lei Zhang

State-of-the-art deep neural networks (DNNs) have been shown to achieve excellent performance on unsupervised domain adaptation (UDA).

Adversarial Attack Pseudo Label +1

Uncertainty-aware Incremental Learning for Multi-organ Segmentation

no code implementations 9 Mar 2021 YuHang Zhou, Xiaoman Zhang, Shixiang Feng, Ya Zhang, Yanfeng Wang

Specifically, given a pretrained $K$ organ segmentation model and a new single-organ dataset, we train a unified $K+1$ organ segmentation model without accessing any data belonging to the previous training stages.

Ethics Incremental Learning +3

GFL: A Decentralized Federated Learning Framework Based On Blockchain

no code implementations 21 Oct 2020 Yifan Hu, YuHang Zhou, Jun Xiao, Chao Wu

Federated learning (FL) is a rapidly growing field, and many centralized and decentralized FL frameworks have been proposed.

Data Poisoning Federated Learning

SAR: Scale-Aware Restoration Learning for 3D Tumor Segmentation

no code implementations 13 Oct 2020 Xiaoman Zhang, Shixiang Feng, YuHang Zhou, Ya Zhang, Yanfeng Wang

We demonstrate the effectiveness of our methods on two downstream tasks: i) Brain tumor segmentation, ii) Pancreas tumor segmentation.

Brain Tumor Segmentation Segmentation +3
