no code implementations • 2 Feb 2025 • Borui Xu, Yao Chen, Zeyi Wen, Weiguo Liu, Bingsheng He
This research not only contributes to the understanding of SLMs but also provides practical insights for researchers seeking efficient summarization solutions that balance performance and resource use.
no code implementations • 15 Jan 2025 • Qian Wang, Jiaying Wu, Zhenheng Tang, Bingqiao Luo, Nuo Chen, Wei Chen, Bingsheng He
We argue that advancing LLM-based human simulation requires addressing both LLM's inherent limitations and simulation framework design challenges.
no code implementations • 15 Jan 2025 • Xuanhe Zhou, Wei Zhou, Liguo Qi, Hao Zhang, Dihao Chen, Bingsheng He, Mian Lu, Guoliang Li, Fan Wu, Yuqiang Chen
Efficient and consistent feature computation is crucial for a wide range of online ML applications.
1 code implementation • 18 Dec 2024 • Jun Hu, Bryan Hooi, Bingsheng He, Yinwei Wei
Our results indicate that the optimal $K$ for certain modalities on specific datasets can be as low as 1 or 2, which may restrict the GNNs' capacity to capture global information.
Ranked #1 on Multi-modal Recommendation on Amazon Clothing
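The role of the propagation depth $K$ can be illustrated with a minimal $K$-step feature propagation over a row-normalized adjacency matrix. This is a generic GNN-style sketch, not the paper's exact model; the graph and function names are illustrative.

```python
import numpy as np

def propagate(features, adj, k):
    """Propagate node features k hops over a row-normalized adjacency matrix."""
    # Row-normalize so each node averages the features of its neighbors.
    norm_adj = adj / adj.sum(axis=1, keepdims=True)
    out = features
    for _ in range(k):
        out = norm_adj @ out  # one hop of neighbor aggregation
    return out

# Toy graph: 3 nodes on a path (0-1-2), with self-loops.
adj = np.array([[1., 1., 0.],
                [1., 1., 1.],
                [0., 1., 1.]])
one_hop = propagate(np.eye(3), adj, k=1)
# After one hop, node 0 mixes only with itself and node 1;
# increasing k widens each node's receptive field toward global information.
```

With small $K$ each node only sees nearby neighbors, which matches the observation that a low optimal $K$ limits how much global structure the model can capture.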
no code implementations • 16 Dec 2024 • Moming Duan, Rui Zhao, Linshan Jiang, Nigel Shadbolt, Bingsheng He
In this paper, we propose addressing the above challenges along two lines: 1) For license analysis, we have developed a new vocabulary for ML workflow management and encoded license rules to enable ontological reasoning for analyzing rights granting and compliance issues.
no code implementations • 16 Nov 2024 • Wei Zhuo, Zemin Liu, Bryan Hooi, Bingsheng He, Guang Tan, Rizal Fathony, Jia Chen
Label imbalance and homophily-heterophily mixture are the fundamental problems encountered when applying Graph Neural Networks (GNNs) to Graph Fraud Detection (GFD) tasks.
1 code implementation • 23 Oct 2024 • Zhaomin Wu, Junyi Hou, Yiqun Diao, Bingsheng He
To overcome these limitations, we introduce the Federated Transformer (FeT), a novel framework that supports multi-party VFL with fuzzy identifiers.
no code implementations • 16 Oct 2024 • Zhenheng Tang, Xueze Kang, Yiming Yin, Xinglin Pan, Yuxin Wang, Xin He, Qiang Wang, Rongfei Zeng, Kaiyong Zhao, Shaohuai Shi, Amelie Chi Zhou, Bo Li, Bingsheng He, Xiaowen Chu
To alleviate hardware scarcity in training large deep neural networks (DNNs), particularly large language models (LLMs), we present FusionLLM, a decentralized training system designed and implemented for training DNNs using geo-distributed GPUs across different computing clusters or individual devices.
no code implementations • 14 Oct 2024 • Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang
As large language models (LLMs) become increasingly prevalent in web services, effectively leveraging domain-specific knowledge while ensuring privacy has become critical.
no code implementations • 14 Oct 2024 • Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng
Instruction tuning improves the responsiveness of pretrained large language models (LLMs) to human instructions, a benefit that stems from diversified instruction data.
no code implementations • 30 Sep 2024 • Zining Zhang, Yao Chen, Bingsheng He, Zhenjie Zhang
The increasing size and complexity of Large Language Models (LLMs) pose challenges for their deployment on personal computers and mobile devices.
1 code implementation • 23 Aug 2024 • Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song
Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis.
1 code implementation • 9 Jul 2024 • Meihan Liu, Zhen Zhang, Jiachen Tang, Jiajun Bu, Bingsheng He, Sheng Zhou
Unsupervised Graph Domain Adaptation (UGDA) involves the transfer of knowledge from a label-rich source graph to an unlabeled target graph under domain discrepancies.
no code implementations • 27 Jun 2024 • Yuan Li, Bingqiao Luo, Qian Wang, Nuo Chen, Xu Liu, Bingsheng He
The utilization of Large Language Models (LLMs) in financial trading has primarily been concentrated within the stock market, aiding in economic and financial decisions.
1 code implementation • 11 Jun 2024 • Tongjun Shi, Shuhao Zhang, Binbin Chen, Bingsheng He
Stream Learning (SL) requires models that can quickly adapt to continuously evolving data, posing significant challenges in both computational efficiency and learning accuracy.
1 code implementation • 3 Mar 2024 • Zhen Zhang, Meihan Liu, Anhui Wang, Hongyang Chen, Zhao Li, Jiajun Bu, Bingsheng He
Unsupervised Graph Domain Adaptation (UGDA) has emerged as a practical solution to transfer knowledge from a label-rich source graph to a completely unlabelled target graph.
no code implementations • 20 Feb 2024 • Qian Wang, Zemin Liu, Zhen Zhang, Bingsheng He
Class imbalance in graph-structured data, where minor classes are significantly underrepresented, poses a critical challenge for Graph Neural Networks (GNNs).
1 code implementation • 11 Dec 2023 • Yiqun Diao, Qinbin Li, Bingsheng He
However, non-IID data has been a key challenge in FL, which could significantly degrade the accuracy of the final model.
1 code implementation • 23 Oct 2023 • Jun Hu, Bryan Hooi, Bingsheng He
To achieve low information loss, we introduce a Relation-wise Neighbor Collection component with an Even-odd Propagation Scheme, which aims to collect information from neighbors in a finer-grained way.
Ranked #1 on Heterogeneous Node Classification on OAG-L1-Field
no code implementations • 18 Oct 2023 • Qinbin Li, Chulin Xie, Xiaojun Xu, Xiaoyuan Liu, Ce Zhang, Bo Li, Bingsheng He, Dawn Song
To address this, we propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data.
1 code implementation • 2 Oct 2023 • Qian Wang, Zhen Zhang, Zemin Liu, Shengliang Lu, Bingqiao Luo, Bingsheng He
While numerous public blockchain datasets are available, their utility is constrained by an exclusive focus on blockchain data.
no code implementations • 3 Sep 2023 • Zhenheng Tang, Yuxin Wang, Xin He, Longteng Zhang, Xinglin Pan, Qiang Wang, Rongfei Zeng, Kaiyong Zhao, Shaohuai Shi, Bingsheng He, Xiaowen Chu
The rapid growth of memory and computation requirements of large language models (LLMs) has outpaced the development of hardware, hindering people who lack large-scale high-end GPUs from training or deploying LLMs.
1 code implementation • 29 Aug 2023 • Yiqun Diao, Yutong Yang, Qinbin Li, Bingsheng He, Mian Lu
Thus, a natural question is what those open-environment challenges look like and how existing incremental learning algorithms perform on real-world relational data streams.
1 code implementation • 26 Aug 2023 • Zemin Liu, Yuan Li, Nan Chen, Qian Wang, Bryan Hooi, Bingsheng He
However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce, thereby leading to biased learning outcomes.
1 code implementation • 5 Jul 2023 • Zhaomin Wu, Junyi Hou, Bingsheng He
However, due to privacy restrictions, few public real-world VFL datasets exist for algorithm evaluation, and these represent a limited array of feature distributions.
2 code implementations • 5 Jul 2023 • Moming Duan, Qinbin Li, Linshan Jiang, Bingsheng He
To fully unleash the potential of FL, we advocate rethinking the design of current FL frameworks and extending it to a more generalized concept: Open Federated Learning Platforms, positioned as a crowdsourcing collaborative machine learning infrastructure for all Internet users.
1 code implementation • 29 Mar 2023 • Sihao Hu, Zhen Zhang, Bingqiao Luo, Shengliang Lu, Bingsheng He, Ling Liu
As various forms of fraud proliferate on Ethereum, it is imperative to safeguard against these malicious activities to protect susceptible users from being victimized.
no code implementations • 21 Nov 2022 • Zining Zhang, Bingsheng He, Zhenjie Zhang
However, due to the gigantic search space and lack of intelligent search guidance, current auto-schedulers require hours to days of tuning time to find the best-performing tensor program for the entire neural network.
1 code implementation • 13 Aug 2022 • Zhaomin Wu, Qinbin Li, Bingsheng He
As societal concerns about data privacy have recently increased, we have witnessed data silos among multiple parties in various applications.
1 code implementation • 21 Apr 2022 • Sihao Hu, Zhen Zhang, Shengliang Lu, Bingsheng He, Zhao Li
With the proliferation of pump-and-dump schemes (P&Ds) in the cryptocurrency market, it becomes imperative to detect such fraudulent activities in advance to alert potentially susceptible investors.
no code implementations • 10 Jan 2022 • Ruofan Liang, Bingsheng He, Shengen Yan, Peng Sun
Multi-tenant machine learning services have become emerging data-intensive workloads in data centers with heavy usage of GPU resources.
1 code implementation • 29 Sep 2021 • Qinbin Li, Bingsheng He, Dawn Song
Federated learning has been a popular approach to enable collaborative learning on multiple parties without exchanging raw data.
1 code implementation • 11 Jun 2021 • Zhaomin Wu, Qinbin Li, Bingsheng He
However, most existing studies in VFL disregard the "record linkage" process.
6 code implementations • CVPR 2021 • Qinbin Li, Bingsheng He, Dawn Song
A key challenge in federated learning is to handle the heterogeneity of local data distribution across parties.
no code implementations • 23 Mar 2021 • Johan Kok Zhi Kang, Gaurav, Sien Yi Tan, Feng Cheng, Shixuan Sun, Bingsheng He
The use of deep learning models for forecasting the resource consumption patterns of SQL queries has recently been a popular area of study.
3 code implementations • 3 Feb 2021 • Qinbin Li, Yiqun Diao, Quan Chen, Bingsheng He
We find that non-IID does bring significant challenges in learning accuracy of FL algorithms, and none of the existing state-of-the-art FL algorithms outperforms others in all cases.
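A common way benchmarks of this kind simulate non-IID data is label-distribution skew via Dirichlet partitioning. The following is a minimal sketch in that spirit; the function and parameter names are illustrative, not the benchmark's API.

```python
import numpy as np

def dirichlet_partition(labels, n_parties, alpha, seed=0):
    """Split sample indices across parties with label-skewed proportions.

    Smaller alpha -> more skewed (more non-IID) label distributions per party.
    """
    rng = np.random.default_rng(seed)
    parties = [[] for _ in range(n_parties)]
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        rng.shuffle(idx)
        # Draw this class's share for each party from a Dirichlet distribution.
        props = rng.dirichlet([alpha] * n_parties)
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for p, part in enumerate(np.split(idx, cuts)):
            parties[p].extend(part.tolist())
    return parties

labels = np.repeat([0, 1, 2], 100)           # 300 samples, 3 classes
parts = dirichlet_partition(labels, n_parties=4, alpha=0.5)
assert sum(len(p) for p in parts) == 300     # every sample assigned exactly once
```

At `alpha=0.5` most parties end up dominated by one or two classes, which is the regime where FL algorithms tend to lose accuracy.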
1 code implementation • 2 Oct 2020 • Qinbin Li, Bingsheng He, Dawn Song
Federated learning enables multiple parties to collaboratively learn a model without exchanging their data.
no code implementations • 28 Sep 2020 • Qinbin Li, Bingsheng He, Dawn Song
In this paper, we propose a novel federated learning algorithm FedKT that needs only a single communication round (i.e., round-optimal).
1 code implementation • 14 Jun 2020 • Sixu Hu, Yuan Li, Xu Liu, Qinbin Li, Zhaomin Wu, Bingsheng He
This paper presents and characterizes an Open Application Repository for Federated Learning (OARF), a benchmark suite for federated machine learning systems.
2 code implementations • 11 Nov 2019 • Qinbin Li, Zhaomin Wu, Zeyi Wen, Bingsheng He
Specifically, by investigating the properties of gradients and the contribution of each tree in GBDTs, we propose adaptively controlling the gradients of the training data at each iteration and clipping leaf nodes, in order to tighten the sensitivity bounds.
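The core idea of tightening sensitivity via clipping can be sketched as follows. This is an illustrative fragment, not the paper's exact GBDT procedure; the threshold, noise mechanism, and function names are assumptions for the sketch.

```python
import numpy as np

def clip_gradients(grads, threshold):
    """Clip each per-sample gradient to [-threshold, threshold].

    Bounding each gradient's magnitude bounds the sensitivity of any
    aggregate over gradients (e.g., a leaf's sum) to the threshold per
    sample, which determines how much noise differential privacy needs.
    """
    return np.clip(grads, -threshold, threshold)

def noisy_leaf_value(grads, threshold, epsilon, seed=0):
    """Leaf aggregate with Laplace noise scaled to the clipped sensitivity."""
    rng = np.random.default_rng(seed)
    clipped = clip_gradients(grads, threshold)
    # Adding or removing one sample changes the sum by at most `threshold`.
    sensitivity = threshold
    return clipped.sum() + rng.laplace(0.0, sensitivity / epsilon)
```

A tighter clipping threshold means less noise for the same privacy budget, at the cost of more gradient distortion; the paper's adaptive control navigates exactly this trade-off.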
3 code implementations • 11 Nov 2019 • Qinbin Li, Zeyi Wen, Bingsheng He
There have been several recent studies on how to train GBDTs in the federated learning setting.
no code implementations • 8 Nov 2019 • Qinbin Li, Zeyi Wen, Bingsheng He
Our experimental results show that EFU often achieves a 20% higher hit ratio than LRU when training with the Gaussian kernel.
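For context, the LRU baseline compared against can be sketched with an ordered dict caching computed kernel-matrix rows; EFU's frequency estimation is specific to the paper, so only the generic LRU side is shown, with illustrative names.

```python
from collections import OrderedDict

class LRUKernelCache:
    """Least-recently-used cache for computed kernel matrix rows."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = OrderedDict()  # row index -> kernel row values
        self.hits = 0
        self.misses = 0

    def get(self, i, compute_row):
        if i in self.rows:
            self.rows.move_to_end(i)       # mark as most recently used
            self.hits += 1
            return self.rows[i]
        self.misses += 1
        row = compute_row(i)               # e.g., evaluate a Gaussian kernel row
        self.rows[i] = row
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict the least recently used row
        return row

cache = LRUKernelCache(capacity=2)
for i in [1, 2, 1, 3, 2]:
    cache.get(i, lambda i: [i * j for j in range(4)])
```

Recency is a poor proxy when SVM training revisits rows in patterns LRU cannot anticipate, which is where a frequency-based policy such as EFU can win.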
1 code implementation • 23 Jul 2019 • Qinbin Li, Zeyi Wen, Zhaomin Wu, Sixu Hu, Naibo Wang, Yuan Li, Xu Liu, Bingsheng He
By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.
2 code implementations • 3 Jul 2019 • Dawen Xu, Ying Wang, Kaijie Tu, Cheng Liu, Bingsheng He, Lei Zhang
Generative neural networks are a new category of neural networks that have been widely utilized in applications such as content generation, unsupervised learning, segmentation, and pose estimation.
no code implementations • 26 Feb 2019 • Chuangyi Gui, Long Zheng, Bingsheng He, Cheng Liu, Xinyu Chen, Xiaofei Liao, Hai Jin
A graph is a well-known data structure for representing associated relationships in a variety of applications, e.g., data science and machine learning.
Distributed, Parallel, and Cluster Computing
no code implementations • 19 Feb 2019 • Junzhe Zhang, Sai Ho Yeung, Yao Shu, Bingsheng He, Wei Wang
They are achieved by exploiting the iterative nature of the training algorithm of deep learning to derive the lifetime and read/write order of all variables.
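The lifetime analysis described above can be sketched: given the ordered sequence of operations each variable appears in, a variable's lifetime is the interval from its first to its last use, and memory can be reused across variables whose intervals don't overlap. A minimal sketch, assuming a simple sequential schedule with illustrative names:

```python
def variable_lifetimes(schedule):
    """Compute (first_use, last_use) step intervals for each variable.

    schedule: list of steps, each a collection of variable names the op touches.
    """
    first, last = {}, {}
    for step, variables in enumerate(schedule):
        for v in variables:
            first.setdefault(v, step)  # record the first time v appears
            last[v] = step             # keep overwriting until the final use
    return {v: (first[v], last[v]) for v in first}

# Toy training iteration: forward produces activations, backward consumes them.
schedule = [
    {"x", "w1", "a1"},   # step 0: a1 = forward(x, w1)
    {"a1", "w2", "a2"},  # step 1: a2 = forward(a1, w2)
    {"a2", "g2"},        # step 2: g2 = backward(a2)
    {"g2", "a1", "g1"},  # step 3: g1 = backward(g2, a1)
]
lifetimes = variable_lifetimes(schedule)
assert lifetimes["a1"] == (0, 3)  # a1 must stay resident until the backward pass
assert lifetimes["a2"] == (1, 2)  # a2's memory can be reclaimed after step 2
```

Because training repeats the same schedule every iteration, these intervals are known in advance, which is what makes static memory planning possible.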