Search Results for author: Bolin Ding

Found 52 papers, 27 papers with code

Data-Juicer: A One-Stop Data Processing System for Large Language Models

1 code implementation • 5 Sep 2023 • Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou

The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, diverse, and high-quality data.

Distributed Computing

FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning

1 code implementation • 1 Sep 2023 • Weirui Kuang, Bingchen Qian, Zitao Li, Daoyuan Chen, Dawei Gao, Xuchen Pan, Yuexiang Xie, Yaliang Li, Bolin Ding, Jingren Zhou

When several entities have similar tasks of interest, but their data cannot be shared because of privacy concerns or regulations, federated learning (FL) is a mainstream solution to leverage the data of different entities.

Benchmarking, Federated Learning, +1

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

1 code implementation • 29 Aug 2023 • Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou

Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning.

Prompt Engineering, Text-To-SQL

Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study

1 code implementation • 16 Jul 2023 • Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

Different from previous studies focused on overall performance, this work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.

Instruction Following, Quantization

Efficient Personalized Federated Learning via Sparse Model-Adaptation

2 code implementations • 4 May 2023 • Daoyuan Chen, Liuyi Yao, Dawei Gao, Bolin Ding, Yaliang Li

To overcome these challenges, we propose a novel approach named pFedGate for efficient personalized FL by adaptively and efficiently learning sparse local models.

Personalized Federated Learning

FS-Real: Towards Real-World Cross-Device Federated Learning

no code implementations • 23 Mar 2023 • Daoyuan Chen, Dawei Gao, Yuexiang Xie, Xuchen Pan, Zitao Li, Yaliang Li, Bolin Ding, Jingren Zhou

Federated Learning (FL) aims to train high-quality models in collaboration with distributed clients while not uploading their local data, which attracts increasing attention in both academia and industry.

Federated Learning

Lero: A Learning-to-Rank Query Optimizer

1 code implementation • 14 Feb 2023 • Rong Zhu, Wei Chen, Bolin Ding, Xingguang Chen, Andreas Pfadler, Ziniu Wu, Jingren Zhou

In this paper, we introduce a learning-to-rank query optimizer, called Lero, which builds on top of a native query optimizer and continuously learns to improve the optimization performance.

Binary Classification, Learning-To-Rank

Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks

1 code implementation • 3 Feb 2023 • Zeyu Qin, Liuyi Yao, Daoyuan Chen, Yaliang Li, Bolin Ding, Minhao Cheng

We conduct the first study of backdoor attacks in the pFL framework, testing 4 widely used backdoor attacks against 6 pFL methods on benchmark datasets FEMNIST and CIFAR-10, a total of 600 experiments.

Backdoor Attack, Personalized Federated Learning

Collaborating Heterogeneous Natural Language Processing Tasks via Federated Learning

1 code implementation • 12 Dec 2022 • Chenhe Dong, Yuexiang Xie, Bolin Ding, Ying Shen, Yaliang Li

In this study, we further broaden the application scope of FL in NLP by proposing an Assign-Then-Contrast (denoted as ATC) framework, which enables clients with heterogeneous NLP tasks to construct an FL course and learn useful knowledge from each other.

Federated Learning, Natural Language Understanding, +1

Towards Universal Sequence Representation Learning for Recommender Systems

1 code implementation • 13 Jun 2022 • Yupeng Hou, Shanlei Mu, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

To develop effective sequential recommenders, a series of sequence representation learning (SRL) methods have been proposed to model historical user behaviors.

Recommendation Systems, Representation Learning

FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization

1 code implementation • 8 Jun 2022 • Zhen Wang, Weirui Kuang, Ce Zhang, Bolin Ding, Yaliang Li

Due to this uniqueness, existing HPO benchmarks no longer satisfy the need to compare HPO methods in the FL setting.

Benchmarking, Federated Learning, +1

pFL-Bench: A Comprehensive Benchmark for Personalized Federated Learning

1 code implementation • 8 Jun 2022 • Daoyuan Chen, Dawei Gao, Weirui Kuang, Yaliang Li, Bolin Ding

Personalized Federated Learning (pFL), which utilizes and deploys distinct local models, has gained increasing attention in recent years due to its success in handling the statistical heterogeneity of FL clients.

Fairness, Personalized Federated Learning

A Benchmark for Federated Hetero-Task Learning

1 code implementation • 7 Jun 2022 • Liuyi Yao, Dawei Gao, Zhen Wang, Yuexiang Xie, Weirui Kuang, Daoyuan Chen, Haohui Wang, Chenhe Dong, Bolin Ding, Yaliang Li

To investigate the heterogeneity in federated learning in real-world scenarios, we generalize classic federated learning to federated hetero-task learning, which emphasizes the inconsistency across the participants in federated learning in terms of both data distribution and learning tasks.

Federated Learning, Meta-Learning, +2

ID-Agnostic User Behavior Pre-training for Sequential Recommendation

no code implementations • 6 Jun 2022 • Shanlei Mu, Yupeng Hou, Wayne Xin Zhao, Yaliang Li, Bolin Ding

Instead of explicitly learning representations for item IDs, IDA-SR directly learns item representations from rich text information.

Language Modelling, Sequential Recommendation

EvenNet: Ignoring Odd-Hop Neighbors Improves Robustness of Graph Neural Networks

1 code implementation • 27 May 2022 • Runlin Lei, Zhen Wang, Yaliang Li, Bolin Ding, Zhewei Wei

Despite their extraordinary predictive accuracy, existing approaches, such as GCN and GPRGNN, are not robust in the face of homophily changes on test graphs, rendering these models vulnerable to graph structural attacks and limiting their capacity to generalize to graphs of varied homophily levels.

Node Classification

FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning

1 code implementation • 12 Apr 2022 • Zhen Wang, Weirui Kuang, Yuexiang Xie, Liuyi Yao, Yaliang Li, Bolin Ding, Jingren Zhou

The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and existing frameworks such as TFF and FATE have made deployment easy in real-world applications.

Federated Learning, Graph Learning

FederatedScope: A Flexible Federated Learning Platform for Heterogeneity

1 code implementation • 11 Apr 2022 • Yuexiang Xie, Zhen Wang, Dawei Gao, Daoyuan Chen, Liuyi Yao, Weirui Kuang, Yaliang Li, Bolin Ding, Jingren Zhou

Although remarkable progress has been made by existing federated learning (FL) platforms to provide infrastructures for development, these platforms may not adequately tackle the challenges brought by various types of heterogeneity, including the heterogeneity in participants' local data, resources, behaviors and learning goals.

Federated Learning, Hyperparameter Optimization

Studying the Impact of Data Disclosure Mechanism in Recommender Systems via Simulation

no code implementations • 1 Apr 2022 • Ziqian Chen, Fei Sun, Yifan Tang, Haokun Chen, Jinyang Gao, Bolin Ding

Then we study users' privacy decision making under different data disclosure mechanisms and recommendation models, and how their data disclosure decisions affect the recommender system's performance.

Decision Making, Federated Learning, +2

Learning to be a Statistician: Learned Estimator for Number of Distinct Values

1 code implementation • 6 Feb 2022 • Renzhi Wu, Bolin Ding, Xu Chu, Zhewei Wei, Xiening Dai, Tao Guan, Jingren Zhou

We derive conditions of the learning framework under which the learned model is workload agnostic, in the sense that the model/estimator can be trained with synthetically generated training data, and then deployed into any data warehouse simply as, e.g., user-defined functions (UDFs), to offer efficient (within microseconds on CPU) and accurate NDV estimations for unseen tables and workloads.

Recommendation Unlearning

1 code implementation • 18 Jan 2022 • Chong Chen, Fei Sun, Min Zhang, Bolin Ding

From the perspective of utility, if a system's utility is damaged by some bad data, the system needs to forget these data to regain utility.

Recommendation Systems

Baihe: SysML Framework for AI-driven Databases

no code implementations • 29 Dec 2021 • Andreas Pfadler, Rong Zhu, Wei Chen, Botong Huang, Tianjing Zeng, Bolin Ding, Jingren Zhou

Based on the high-level architecture, we then describe a concrete implementation of Baihe for PostgreSQL and present example use cases for learned query optimizers.

Towards Personalized Answer Generation in E-Commerce via Multi-Perspective Preference Modeling

1 code implementation • 27 Dec 2021 • Yang Deng, Yaliang Li, Wenxuan Zhang, Bolin Ding, Wai Lam

Recently, Product Question Answering (PQA) on E-Commerce platforms has attracted increasing attention as it can act as an intelligent online shopping assistant and improve the customer shopping experience.

Answer Generation, Question Answering

Glue: Adaptively Merging Single Table Cardinality to Estimate Join Query Size

no code implementations • 7 Dec 2021 • Rong Zhu, Tianjing Zeng, Andreas Pfadler, Wei Chen, Bolin Ding, Jingren Zhou

Cardinality estimation (CardEst), a central component of the query optimizer, plays a significant role in generating high-quality query plans in DBMS.

Path-specific Causal Fair Prediction via Auxiliary Graph Structure Learning

no code implementations • 29 Sep 2021 • Liuyi Yao, Yaliang Li, Bolin Ding, Jingren Zhou, Jinduo Liu, Mengdi Huai, Jing Gao

To tackle these challenges, we propose a novel causal-graph-based fair prediction framework, which integrates graph structure learning into fair prediction to ensure that unfair pathways are excluded in the causal graph.

Fairness, Graph structure learning

Learned Index with Dynamic $\epsilon$

no code implementations • 29 Sep 2021 • Daoyuan Chen, Wuchao Li, Yaliang Li, Bolin Ding, Kai Zeng, Defu Lian, Jingren Zhou

We theoretically analyze prediction error bounds that link $\epsilon$ with data characteristics for an illustrative learned index method.


iFlood: A Stable and Effective Regularizer

no code implementations • ICLR 2022 • Yuexiang Xie, Zhen Wang, Yaliang Li, Ce Zhang, Jingren Zhou, Bolin Ding

However, our further studies uncover that the design of the loss function of Flooding can lead to a discrepancy between its objective and implementation, and cause the instability issue.

Image Classification

Coarformer: Transformer for large graph via graph coarsening

no code implementations • 29 Sep 2021 • Weirui Kuang, Zhen Wang, Yaliang Li, Zhewei Wei, Bolin Ding

We get rid of these obstacles by exploiting the complementary natures of GNN and Transformer, and trade the fine-grained long-range information for the efficiency of Transformer.

Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation

1 code implementation • Findings (EMNLP) 2021 • Yuexiang Xie, Fei Sun, Yang Deng, Yaliang Li, Bolin Ding

However, existing metrics either neglect the intrinsic cause of the factual inconsistency or rely on auxiliary tasks, leading to an unsatisfactory correlation with human judgments or making them less convenient to use in practice.

Abstractive Text Summarization

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition

3 code implementations • 19 Jul 2021 • Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Bolin Ding, Yaliang Li, Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang, Bin Cui

End-to-end AutoML, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning, has attracted intense interest from both academia and industry.

AutoML, Feature Engineering, +1

CausCF: Causal Collaborative Filtering for Recommendation Effect Estimation

no code implementations • 28 May 2021 • Xu Xie, Zhaoyang Liu, Shiwen Wu, Fei Sun, Cihang Liu, Jiawei Chen, Jinyang Gao, Bin Cui, Bolin Ding

It is based on the idea that similar users not only have similar tastes in items, but also have similar treatment effects under recommendations.

Collaborative Filtering, Recommendation Systems

Unified Conversational Recommendation Policy Learning via Graph-based Reinforcement Learning

no code implementations • 20 May 2021 • Yang Deng, Yaliang Li, Fei Sun, Bolin Ding, Wai Lam

However, existing methods mainly target one or two of these three decision-making problems in CRS with separated conversation and recommendation components, which restricts the scalability and generality of CRS and falls short of preserving a stable training procedure.

Decision Making, Recommendation Systems, +2

Competitive Information Design for Pandora's Box

no code implementations • 5 Mar 2021 • Bolin Ding, Yiding Feng, Chien-Ju Ho, Wei Tang

We study a natural competitive-information-design variant for the Pandora's Box problem (Weitzman 1979), where each box is associated with a strategic information sender who can design what information about the box's prize value to be revealed to the agent when the agent inspects the box.

Computer Science and Game Theory

FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data

no code implementations • 9 Jan 2021 • Shuyuan Yan, Bolin Ding, Wei Guo, Jingren Zhou, Zhewei Wei, Xiaowei Jiang, Sheng Xu

Our scalable real-time forecasting system FlashP (Flash Prediction) is built based on this idea, with two major challenges to be resolved in this paper: first, we need to figure out how approximate aggregations affect the fitting of forecasting models, and forecasting results; and second, accordingly, what sampling algorithms we should use to obtain these approximate aggregations and how large the samples are.

Time Series, Time Series Analysis

A Pluggable Learned Index Method via Sampling and Gap Insertion

no code implementations • 4 Jan 2021 • Yaliang Li, Daoyuan Chen, Bolin Ding, Kai Zeng, Jingren Zhou

In this paper, we propose a formal machine learning based framework to quantify the index learning objective, and study two general and pluggable techniques to enhance the learning efficiency and learning effectiveness for learned indexes.

BIG-bench Machine Learning, Retrieval

PURE: An Uncertainty-aware Recommendation Framework for Maximizing Expected Posterior Utility of Platform

no code implementations • 1 Jan 2021 • Haokun Chen, Zhaoyang Liu, Chen Xu, Ziqian Chen, Jinyang Gao, Bolin Ding

In this paper, we propose a novel recommendation framework which effectively utilizes the information of user uncertainty over different item dimensions and explicitly takes into consideration the impact of display policy on user in order to achieve maximal expected posterior utility for the platform.

Learning to Mutate with Hypergradient Guided Population

no code implementations • NeurIPS 2020 • Zhiqiang Tao, Yaliang Li, Bolin Ding, Ce Zhang, Jingren Zhou, Yun Fu

Computing the gradient of model hyperparameters, i.e., hypergradient, enables a promising and natural way to solve the hyperparameter optimization task.

Hyperparameter Optimization

Scalable Graph Neural Networks via Bidirectional Propagation

1 code implementation • NeurIPS 2020 • Ming Chen, Zhewei Wei, Bolin Ding, Yaliang Li, Ye Yuan, Xiaoyong Du, Ji-Rong Wen

Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.

Graph Sampling

Contrastive Learning for Sequential Recommendation

1 code implementation • 27 Oct 2020 • Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Bolin Ding, Bin Cui

Sequential recommendation methods play a crucial role in modern recommender systems because of their ability to capture a user's dynamic interest from her/his historical interactions.

Contrastive Learning, Data Augmentation, +1

FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data

no code implementations • 29 Jul 2020 • Yuexiang Xie, Zhen Wang, Yaliang Li, Bolin Ding, Nezihe Merve Gürel, Ce Zhang, Minlie Huang, Wei Lin, Jingren Zhou

Then we instantiate this search strategy by optimizing both a dedicated graph neural network (GNN) and the adjacency tensor associated with the defined feature graph.

Recommendation Systems

Simple and Deep Graph Convolutional Networks

4 code implementations • ICML 2020 • Ming Chen, Zhewei Wei, Zengfeng Huang, Bolin Ding, Yaliang Li

We propose GCNII, an extension of the vanilla GCN model with two simple yet effective techniques: initial residual and identity mapping.

Graph Classification, Graph Regression, +3

Practical Data Poisoning Attack against Next-Item Recommendation

no code implementations • 7 Apr 2020 • Hengtong Zhang, Yaliang Li, Bolin Ding, Jing Gao

In real-world recommendation systems, the cost of retraining recommendation models is high, and the interaction frequency between users and a recommendation system is restricted. Given these real-world restrictions, we propose to let the agent interact with a recommender simulator instead of the target recommendation system and leverage the transferability of the generated adversarial samples to poison the target system.

Data Poisoning, Recommendation Systems

AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search

1 code implementation • 13 Jan 2020 • Daoyuan Chen, Yaliang Li, Minghui Qiu, Zhen Wang, Bofang Li, Bolin Ding, Hongbo Deng, Jun Huang, Wei Lin, Jingren Zhou

Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress BERT into task-adaptive small models for specific tasks.

Knowledge Distillation, Neural Architecture Search

Automated Relational Meta-learning

1 code implementation • ICLR 2020 • Huaxiu Yao, Xian Wu, Zhiqiang Tao, Yaliang Li, Bolin Ding, Ruirui Li, Zhenhui Li

In order to efficiently learn with a small amount of data on new tasks, meta-learning transfers knowledge learned from previous tasks to the new ones.

Few-Shot Image Classification, Meta-Learning

Improving Utility and Security of the Shuffler-based Differential Privacy

1 code implementation • 30 Aug 2019 • Tianhao Wang, Bolin Ding, Min Xu, Zhicong Huang, Cheng Hong, Jingren Zhou, Ninghui Li, Somesh Jha

When collecting information, local differential privacy (LDP) alleviates privacy concerns of users because their private information is randomized before being sent to the central aggregator.

Continuous Integration of Machine Learning Models with Towards a Rigorous Yet Practical Treatment

no code implementations • 1 Mar 2019 • Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang

Continuous integration is an indispensable step of modern software engineering practices to systematically manage the life cycles of system development.

BIG-bench Machine Learning

ABC: Efficient Selection of Machine Learning Configuration on Large Dataset

no code implementations • 8 Nov 2018 • Silu Huang, Chi Wang, Bolin Ding, Surajit Chaudhuri

A machine learning configuration refers to a combination of preprocessor, learner, and hyperparameters.

BIG-bench Machine Learning

Towards Differentially Private Truth Discovery for Crowd Sensing Systems

no code implementations • 10 Oct 2018 • Yaliang Li, Houping Xiao, Zhan Qin, Chenglin Miao, Lu Su, Jing Gao, Kui Ren, Bolin Ding

To better utilize sensory data, the problem of truth discovery, whose goal is to estimate user quality and infer reliable aggregated results through quality-aware data aggregation, has emerged as a hot topic.

Privacy Preserving

Collecting Telemetry Data Privately

no code implementations • NeurIPS 2017 • Bolin Ding, Janardhan Kulkarni, Sergey Yekhanin

In particular, existing LDP algorithms are not suitable for repeated collection of counter data such as daily app usage statistics.

Learn-Memorize-Recall-Reduce: A Robotic Cloud Computing Paradigm

no code implementations • 16 Apr 2017 • Shaoshan Liu, Bolin Ding, Jie Tang, Dawei Sun, Zhe Zhang, Grace Tsai, Jean-Luc Gaudiot

The rise of robotic applications has led to the generation of a huge volume of unstructured data, whereas the current cloud infrastructure was designed to process limited amounts of structured data.

Cloud Computing, Memorization
