Search Results for author: Chuan Wu

Found 31 papers, 11 papers with code

BG-HGNN: Toward Scalable and Efficient Heterogeneous Graph Neural Network

no code implementations · 13 Mar 2024 · Junwei Su, Lingjun Mao, Chuan Wu

Many computer vision and machine learning problems are modelled as learning tasks on heterogeneous graphs, featuring a wide array of relations from diverse types of nodes and edges.

Relation

Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems

no code implementations · 13 Mar 2024 · Junwei Su, Difan Zou, Chuan Wu

In this paper, we study the generalization performance of SGD with preconditioning for the least squares problem.

regression
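
To make the setup concrete, here is a minimal numpy sketch of SGD with a fixed preconditioner on the least squares objective; the preconditioner choice and hyperparameters are illustrative assumptions, not the paper's.

```python
import numpy as np

def preconditioned_sgd(X, y, P, lr=0.01, epochs=100, seed=0):
    """Minimal sketch: SGD on the least squares loss 0.5*||Xw - y||^2,
    with a fixed preconditioner P applied to each stochastic gradient."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]   # per-sample gradient
            w -= lr * P @ grad                # preconditioned update
    return w

# Example preconditioner: inverse of the regularized data covariance
X = np.random.randn(200, 5)
y = X @ np.ones(5)
P = np.linalg.inv(X.T @ X / len(X) + 0.1 * np.eye(5))
w = preconditioned_sgd(X, y, P)
```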

On the Topology Awareness and Generalization Performance of Graph Neural Networks

no code implementations · 7 Mar 2024 · Junwei Su, Chuan Wu

Using this framework, we investigate the effects of topology awareness on GNN generalization performance.

Active Learning

LoRA Meets Dropout under a Unified Framework

no code implementations · 25 Feb 2024 · Sheng Wang, Liheng Chen, Jiyue Jiang, Boyang Xue, Lingpeng Kong, Chuan Wu

Hence, a possible contradiction arises between the negligible trainable parameters of LoRA and the effectiveness of previous dropout methods, one that has been largely overlooked.
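
As background for that contradiction, below is a hedged PyTorch sketch of the common pattern: dropout applied to the input of LoRA's low-rank branch, whose trainable parameters (A and B) are tiny relative to the frozen base weight. This is a generic illustration, not the paper's unified framework.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-adapted linear layer with dropout on the
    low-rank branch, the placement common in LoRA implementations."""
    def __init__(self, in_f, out_f, r=8, alpha=16, p=0.1):
        super().__init__()
        self.base = nn.Linear(in_f, out_f)     # pretrained weight, frozen
        self.base.weight.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))
        self.drop = nn.Dropout(p)
        self.scale = alpha / r

    def forward(self, x):
        # Dropout acts only on the tiny trainable branch (A, B),
        # not on the frozen base projection.
        return self.base(x) + self.drop(x) @ self.A.T @ self.B.T * self.scale
```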

PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA

no code implementations · 24 Feb 2024 · Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu

We hope that this conspicuously higher parameter efficiency can establish PRoLoRA as a resource-friendly alternative to LoRA.

MSPipe: Efficient Temporal GNN Training via Staleness-aware Pipeline

1 code implementation · 23 Feb 2024 · Guangming Sheng, Junwei Su, Chao Huang, Chuan Wu

However, the iterative reading and updating of the memory module in MTGNNs, needed to obtain up-to-date information, must follow the temporal dependencies.

Scheduling
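
The staleness idea can be illustrated with a small sketch: let iteration t read memory that is at most k iterations old, so memory updates and GNN computation can overlap in a pipeline. The memory interface (read/compute_update/apply) is hypothetical; this is not MSPipe's actual implementation.

```python
def train_with_bounded_staleness(batches, memory, model, k=2):
    """Illustrative sketch: iteration t may read memory produced at
    iteration >= t - k, breaking the strict temporal dependency."""
    pending = []  # deferred memory-update tasks
    for t, batch in enumerate(batches):
        # Enforce the staleness bound: apply updates older than t - k.
        while pending and pending[0][0] <= t - k:
            _, update = pending.pop(0)
            memory.apply(update)
        h = model(batch, memory.read(batch.nodes))   # possibly stale read
        update = memory.compute_update(batch, h)     # defer the write
        pending.append((t, update))
    for _, update in pending:                        # drain at the end
        memory.apply(update)
```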

Towards Robust Graph Incremental Learning on Evolving Graphs

no code implementations · 20 Feb 2024 · Junwei Su, Difan Zou, Zijun Zhang, Chuan Wu

We provide a formal formulation and analysis of the problem, and propose a novel regularization-based technique called Structural-Shift-Risk-Mitigation (SSRM) to mitigate the impact of structural shift on catastrophic forgetting in the inductive NGIL problem.

Incremental Learning

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

1 code implementation · 12 Feb 2024 · Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li, Wei Bi, Lingpeng Kong

This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique to improve the reasoning ability in autoregressive language models.

Math

PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks

no code implementations · 6 Feb 2024 · Junwei Su, Difan Zou, Chuan Wu

Memory-based Dynamic Graph Neural Networks (MDGNNs) are a family of dynamic graph neural networks that leverage a memory module to extract, distill, and memorize long-term temporal dependencies, leading to superior performance compared to memory-less counterparts.
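
To make the memory module concrete, here is a TGN-style sketch in PyTorch: one state vector per node, updated by a GRU cell whenever the node appears in an event. This reflects the common MDGNN design, not necessarily PRES's exact formulation.

```python
import torch
import torch.nn as nn

class NodeMemory(nn.Module):
    """TGN-style memory module sketch: a persistent state vector per
    node, refreshed by a GRU cell on each interaction event."""
    def __init__(self, num_nodes, dim):
        super().__init__()
        self.register_buffer("state", torch.zeros(num_nodes, dim))
        self.cell = nn.GRUCell(dim, dim)

    def update(self, node_ids, messages):
        # messages: (len(node_ids), dim), e.g. encoded interaction events
        new = self.cell(messages, self.state[node_ids])
        self.state[node_ids] = new.detach()  # memorize long-term state

    def read(self, node_ids):
        return self.state[node_ids]
```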

GNNFlow: A Distributed Framework for Continuous Temporal GNN Learning on Dynamic Graphs

1 code implementation · 29 Nov 2023 · Yuchen Zhong, Guangming Sheng, Tianzuo Qin, Minjie Wang, Quan Gan, Chuan Wu

We introduce GNNFlow, a distributed framework that enables efficient continuous temporal graph representation learning on dynamic graphs across multi-GPU machines.

Graph Learning Graph Representation Learning +1

DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines

2 code implementations · 17 Nov 2023 · Chenyu Jiang, Zhen Jia, Shuai Zheng, Yida Wang, Chuan Wu

This paper proposes a dynamic micro-batching approach to tackle sequence length variation and enable efficient multi-task model training.

Language Modelling Large Language Model +3
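
A simplified illustration of dynamic micro-batching: greedily group variable-length sequences so each micro-batch's padded size stays under a token budget. DynaPipe's actual planner is more sophisticated; this greedy packer only conveys the idea.

```python
def pack_microbatches(seq_lens, token_budget):
    """Greedy stand-in for dynamic micro-batching: a micro-batch's
    cost is its padded size, count * max_len."""
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i])
    micro, batches = [], []
    for i in order:
        cand = micro + [i]
        cost = len(cand) * max(seq_lens[j] for j in cand)  # padded tokens
        if micro and cost > token_budget:
            batches.append(micro)
            micro = [i]
        else:
            micro = cand
    if micro:
        batches.append(micro)
    return batches

# e.g. pack_microbatches([12, 480, 30, 500, 64], token_budget=1024)
# -> two micro-batches: the short sequences together, the long ones together
```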

CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs

1 code implementation · 16 Nov 2023 · Hanpeng Hu, Junwei Su, Juntao Zhao, Yanghua Peng, Yibo Zhu, Haibin Lin, Chuan Wu

Considering the large space of DNN models and devices, which impedes direct profiling of all combinations, recent efforts focus on building a predictor to model the performance of DNN models on different devices.

Domain Adaptation
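
The predictor-based approach can be sketched generically: featurize (tensor program, device) pairs and regress latency. The features and numbers below are made-up placeholders, and this is not CDMPP's actual architecture.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def featurize(op_counts, device_spec):
    # op_counts: e.g. [matmul_flops, conv_flops, bytes_moved]
    # device_spec: e.g. [peak_tflops, mem_bandwidth_gbps]
    return np.array(op_counts + device_spec)

# Toy training data (illustrative values only)
X = np.array([featurize([1e9, 0.0, 4e6], [15.0, 900.0]),
              featurize([0.0, 2e9, 8e6], [7.5, 450.0])])
y = np.array([1.8, 5.2])   # measured latencies in ms (placeholders)

model = RandomForestRegressor().fit(X, y)
pred = model.predict(X[:1])  # predict latency for an unseen pair
```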

Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training

1 code implementation · 2 Jun 2023 · Borui Wan, Juntao Zhao, Chuan Wu

Distributed full-graph training of Graph Neural Networks (GNNs) over large graphs is bandwidth-demanding and time-consuming.

Quantization
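
A minimal sketch of message quantization in this setting: compress cross-machine GNN messages to low-bit integers plus a scale and offset, and dequantize on the receiver. An adaptive scheme would pick the bit-width per message; here it is fixed for simplicity.

```python
import torch

def quantize(t, bits=8):
    """Uniform min-max quantization sketch (bits <= 8 for uint8 storage):
    send low-bit integers plus (scale, zero-point) instead of fp32."""
    qmax = 2 ** bits - 1
    lo, hi = t.min(), t.max()
    scale = (hi - lo).clamp(min=1e-8) / qmax
    q = ((t - lo) / scale).round().clamp(0, qmax).to(torch.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q.float() * scale + lo
```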

A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment

no code implementations · 14 May 2023 · Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu

In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the CS principle and emotional support strategy.

Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform

no code implementations · 16 Feb 2023 · Shiwei Zhang, Lansong Diao, Siyu Wang, Zongyan Cao, Yiliang Gu, Chang Si, Ziji Shi, Zhen Zheng, Chuan Wu, Wei Lin

We present Rhino, a system for accelerating tensor programs with automatic parallelization on an AI platform in a real production environment.

Expediting Distributed DNN Training with Device Topology-Aware Graph Deployment

no code implementations · 13 Feb 2023 · Shiwei Zhang, Xiaodong Yi, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin

This paper presents TAG, an automatic system that derives an optimized DNN training graph and its deployment onto any device topology, for expedited training in device- and topology-heterogeneous ML clusters.

Combinatorial Optimization TAG

Towards Robust Inductive Graph Incremental Learning via Experience Replay

no code implementations · 7 Feb 2023 · Junwei Su, Chuan Wu

Inductive node-wise graph incremental learning is a challenging task due to the dynamic nature of evolving graphs and the dependencies between nodes.

Incremental Learning
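
As a concrete anchor, here is a generic experience-replay sketch for node-wise incremental learning: keep a small buffer of nodes from earlier tasks and mix them into later training batches. The uniform sampling policy is an illustrative assumption, not the paper's.

```python
import random

class ReplayBuffer:
    """Generic experience-replay sketch: retain a few nodes per past
    task and revisit them while training on new tasks, to mitigate
    catastrophic forgetting."""
    def __init__(self, capacity_per_task=100):
        self.capacity = capacity_per_task
        self.store = []  # node ids from earlier tasks

    def add_task(self, node_ids):
        node_ids = list(node_ids)
        self.store.extend(random.sample(node_ids,
                                        min(self.capacity, len(node_ids))))

    def sample(self, k):
        return random.sample(self.store, min(k, len(self.store)))

# During task t: train on new nodes plus buffer.sample(k) replayed nodes.
```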

MixSynthFormer: A Transformer Encoder-like Structure with Mixed Synthetic Self-attention for Efficient Human Pose Estimation

no code implementations · ICCV 2023 · Yuran Sun, Alan William Dougherty, Zhuoying Zhang, Yi King Choi, Chuan Wu

Human pose estimation in videos has wide-ranging practical applications across various fields, many of which require fast inference on resource-scarce devices, necessitating the development of efficient and accurate algorithms.

3D Pose Estimation motion prediction +1

dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training

no code implementations · 5 May 2022 · Hanpeng Hu, Chenyu Jiang, Yuchen Zhong, Yanghua Peng, Chuan Wu, Yibo Zhu, Haibin Lin, Chuanxiong Guo

Distributed training using multiple devices (e.g., GPUs) has been widely adopted for learning DNN models over large datasets.

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

no code implementations · 16 Dec 2021 · Tianfeng Liu, Yangrui Chen, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo

Extensive experiments on various GNN models and large graph datasets show that BGL significantly outperforms existing GNN training systems by 20.68x on average.

Graph Property Prediction Node Classification +1

Adversarial Deep Learning for Online Resource Allocation

no code implementations · 19 Nov 2021 · Bingqian Du, Zhiyi Huang, Chuan Wu

Inspired by adversarial training in Generative Adversarial Networks (GANs) and the fact that the competitive ratio of an online algorithm is determined by its worst-case input, we adopt deep neural networks to learn an online algorithm for a resource allocation and pricing problem from scratch, with the goal of minimizing the performance gap between the offline optimum and the learned online algorithm on worst-case inputs.

Decision Making
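
The min-max training idea can be sketched as follows: an adversary network proposes a hard input sequence, and the algorithm network is trained to shrink its gap to the offline optimum on that sequence. All interfaces here (alg_net, adv_net, offline_opt) are hypothetical placeholders, not the paper's code.

```python
import torch

def minimax_step(alg_net, adv_net, alg_opt, adv_opt, offline_opt,
                 noise_dim=16):
    """GAN-style sketch: one update for each player on a shared gap."""
    z = torch.randn(1, noise_dim)
    seq = adv_net(z)                        # candidate worst-case input
    gap = offline_opt(seq) - alg_net(seq)   # scalar performance gap

    alg_opt.zero_grad()
    adv_opt.zero_grad()
    gap.backward()                          # fills grads for both players
    for p in adv_net.parameters():          # adversary ascends the gap
        if p.grad is not None:
            p.grad.neg_()
    alg_opt.step()                          # algorithm minimizes the gap
    adv_opt.step()
```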

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

1 code implementation · 28 Oct 2021 · Jinhui Yuan, Xinqi Li, Cheng Cheng, Juncheng Liu, Ran Guo, Shenghang Cai, Chi Yao, Fei Yang, Xiaodong Yi, Chuan Wu, Haoran Zhang, Jie Zhao

Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.
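
The SBP abstraction itself is easy to illustrate with plain numpy, independent of OneFlow's API: split shards a tensor along an axis, broadcast replicates it, and partial-value holds per-device pieces that sum to the logical tensor.

```python
import numpy as np

# Illustrating SBP semantics with numpy (not OneFlow's actual API).
X = np.arange(12.0).reshape(4, 3)
W = np.ones((3, 2))

# split: shard a tensor along an axis across 2 devices
X_cols = np.split(X, [2], axis=1)   # dev0: first 2 cols, dev1: last col
W_rows = np.split(W, [2], axis=0)   # matching row shards of W

# broadcast: every device would hold an identical full copy (e.g. of W)

# partial-value: per-device results that SUM to the logical tensor,
# e.g. each device's partial matmul, combined by an all-reduce
partials = [x @ w for x, w in zip(X_cols, W_rows)]
assert np.allclose(sum(partials), X @ W)
```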

On Locality in Graph Learning via Graph Neural Network

no code implementations · 29 Sep 2021 · Junwei Su, Jiaqi Han, Chuan Wu

In this paper, we study how the training set in the input graph affects the performance of GNNs.

Active Learning Graph Learning

WN-Salience: A Corpus of News Articles with Entity Salience Annotations

no code implementations · LREC 2020 · Chuan Wu, Evangelos Kanoulas, Maarten de Rijke, Wei Lu

To support research on entity salience, we present a new dataset, the WikiNews Salience dataset (WN-Salience), which can be used to benchmark tasks such as entity salience detection and salient entity linking.

Entity Linking

Distributed Machine Learning through Heterogeneous Edge Systems

no code implementations · 16 Nov 2019 · Hanpeng Hu, Dan Wang, Chuan Wu

Many emerging AI applications request distributed machine learning (ML) among edge systems (e.g., IoT devices and PCs at the edge of the Internet), where data cannot be uploaded to a central venue for model training due to their large volumes and/or security/privacy concerns.

BIG-bench Machine Learning
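
One standard pattern for this setting is FedAvg-style aggregation, sketched below: edge devices train locally on private data, and only model weights travel to the server. The paper's scheme may differ; this shows only the generic pattern.

```python
import numpy as np

def fedavg(client_updates):
    """FedAvg-style aggregation sketch. client_updates is a list of
    (weights, n_samples) pairs returned by edge devices; raw data
    never leaves the device."""
    total = sum(n for _, n in client_updates)
    return sum(w * (n / total) for w, n in client_updates)

# Each round: broadcast the global weights; each edge device runs
# local SGD on its private data; the server averages the results.
w_global = fedavg([(np.ones(4) * 1.0, 80), (np.ones(4) * 3.0, 20)])
```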

Characterizing Deep Learning Training Workloads on Alibaba-PAI

no code implementations · 14 Oct 2019 · Mengdi Wang, Chen Meng, Guoping Long, Chuan Wu, Jun Yang, Wei Lin, Yangqing Jia

One critical issue in efficiently operating practical AI clouds is to characterize the computing and data transfer demands of these workloads and, more importantly, the training performance given the underlying software framework and hardware configurations.

DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters

1 code implementation · 13 Sep 2019 · Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, Chen Meng, Wei Lin

DL2 is a DL-driven scheduler for DL clusters, aiming to expedite training jobs globally by dynamically resizing the resources allocated to each job.

Fairness reinforcement-learning +2

Online Job Scheduling in Distributed Machine Learning Clusters

no code implementations · 3 Jan 2018 · Yixin Bao, Yanghua Peng, Chuan Wu, Zongpeng Li

In a shared cluster handling multiple training jobs, a fundamental issue is how to efficiently schedule jobs and set the number of concurrent workers to run for each job, such that server resources are maximally utilized and model training can be completed in time.

Distributed, Parallel, and Cluster Computing

Normalized Direction-preserving Adam

1 code implementation · ICLR 2018 · Zijun Zhang, Lin Ma, Zongpeng Li, Chuan Wu

Adaptive optimization algorithms, such as Adam and RMSprop, have shown better optimization performance than stochastic gradient descent (SGD) in some scenarios.

General Classification
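
For reference, a vanilla Adam step is sketched below; ND-Adam modifies this baseline to additionally preserve the gradient direction per weight vector, a refinement omitted here.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One vanilla Adam update (the adaptive baseline the paper builds on).
    w: weights, g: gradient, m/v: running moments, t: step count (>= 1)."""
    m = b1 * m + (1 - b1) * g            # first-moment estimate
    v = b2 * v + (1 - b2) * g * g        # second-moment estimate
    m_hat = m / (1 - b1 ** t)            # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```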

Online Influence Maximization in Non-Stationary Social Networks

1 code implementation · 26 Apr 2016 · Yixin Bao, Xiaoke Wang, Zhi Wang, Chuan Wu, Francis C. M. Lau

Nevertheless, the existing studies mostly investigate the problem on a one-off basis, assuming fixed known influence probabilities among users, or the knowledge of the exact social network topology.

Marketing
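
For contrast with the online, non-stationary setting this paper targets, here is the classic one-off greedy baseline under the independent cascade model, which assumes fixed, known influence probabilities and a known topology.

```python
import random

def simulate_spread(graph, seeds, p=0.1, runs=200):
    """Monte Carlo estimate of influence spread under the independent
    cascade model. graph: {node: [neighbors]}."""
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v in graph.get(u, []):
                if v not in active and random.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs

def greedy_im(graph, k):
    """One-off greedy hill-climbing: repeatedly add the node with the
    largest estimated marginal spread."""
    seeds = []
    for _ in range(k):
        best = max((n for n in graph if n not in seeds),
                   key=lambda n: simulate_spread(graph, seeds + [n]))
        seeds.append(best)
    return seeds
```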
