Search Results for author: James Cheng

Found 57 papers, 23 papers with code

Discovery of the Hidden World with Large Language Models

no code implementations 6 Feb 2024 Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

The rise of large language models (LLMs), which are trained to learn rich knowledge from massive observations of the world, provides a new opportunity to assist in discovering high-level hidden variables from raw observational data.

Causal Discovery

Enhancing Neural Subset Selection: Integrating Background Information into Set Representations

no code implementations 5 Feb 2024 Binghui Xie, Yatao Bian, Kaiwen Zhou, Yongqiang Chen, Peilin Zhao, Bo Han, Wei Meng, James Cheng

Learning neural subset selection tasks, such as compound selection in AI-aided drug discovery, has become increasingly pivotal across diverse applications.

Drug Discovery

Enhancing Evolving Domain Generalization through Dynamic Latent Representations

no code implementations 16 Jan 2024 Binghui Xie, Yongqiang Chen, Jiaqi Wang, Kaiwen Zhou, Bo Han, Wei Meng, James Cheng

However, in non-stationary tasks where new domains evolve along an underlying continuous structure, such as time, merely extracting the invariant features is insufficient for generalization to the evolving new domains.

Evolving Domain Generalization

SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification

1 code implementation 16 Dec 2023 Yuntao Gui, Xiao Yan, Peiqi Yin, Han Yang, James Cheng

Thus, we design the sparse MHA module, which computes and stores only large attention weights to reduce memory consumption, and the routed FFN module, which dynamically activates a subset of model parameters for each token to reduce computation cost.

Quantization
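
A minimal sketch of the top-k sparse-attention idea described above (an illustration under assumed shapes, not the SPT implementation): only the k largest attention scores per query are kept, so only large weights survive the softmax and need to be stored.

```python
# Hypothetical top-k sparse attention: keep only the k largest scores per
# query and mask the rest before softmax. Illustration only, not SPT code.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, topk=16):
    """q, k, v: tensors of shape (batch, seq_len, dim)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (B, L, L)
    topk = min(topk, scores.size(-1))
    kth = scores.topk(topk, dim=-1).values[..., -1:]        # k-th largest score
    scores = scores.masked_fill(scores < kth, float("-inf"))
    attn = F.softmax(scores, dim=-1)                         # sparse weights
    return attn @ v

q = torch.randn(2, 128, 64)
print(topk_sparse_attention(q, q, q).shape)  # torch.Size([2, 128, 64])
```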

Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes

no code implementations 30 Nov 2023 Yongqiang Chen, Binghui Xie, Kaiwen Zhou, Bo Han, Yatao Bian, James Cheng

Surprisingly, DeepSet outperforms transformers across a variety of distribution shifts, implying that preserving the permutation-invariance symmetry of the input demonstrations is crucial for OOD ICL.

In-Context Learning

Towards out-of-distribution generalizable predictions of chemical kinetics properties

1 code implementation 4 Oct 2023 ZiHao Wang, Yongqiang Chen, Yang Duan, Weijiang Li, Bo Han, James Cheng, Hanghang Tong

Under this framework, we create comprehensive datasets to benchmark (1) the state-of-the-art ML approaches for reaction prediction in the OOD setting and (2) the state-of-the-art graph OOD methods in kinetics property prediction problems.

Property Prediction

Understanding and Improving Feature Learning for Out-of-Distribution Generalization

1 code implementation NeurIPS 2023 Yongqiang Chen, Wei Huang, Kaiwen Zhou, Yatao Bian, Bo Han, James Cheng

Moreover, when the ERM-learned features are fed to the OOD objectives, the quality of invariant feature learning significantly affects the final OOD performance, as OOD objectives rarely learn new features.

Out-of-Distribution Generalization
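
A schematic sketch of the two-stage protocol the abstract alludes to, under assumptions of my own (a toy PyTorch setup with a VREx-style variance penalty standing in for "the OOD objective"; the names featurizer and classifier are hypothetical):

```python
# Hypothetical two-stage protocol: (1) learn features with plain ERM,
# (2) fit an OOD objective (here a VREx-style variance-across-environments
# penalty) on top of the frozen ERM features. Illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

featurizer = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
classifier = nn.Linear(32, 2)
envs = [(torch.randn(64, 10), torch.randint(0, 2, (64,))) for _ in range(3)]

# Stage 1: ERM over all environments.
opt = torch.optim.Adam(list(featurizer.parameters()) + list(classifier.parameters()), lr=1e-2)
for _ in range(100):
    loss = sum(F.cross_entropy(classifier(featurizer(x)), y) for x, y in envs)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: freeze the ERM-learned features, retrain the classifier with a
# penalty on the variance of per-environment risks.
for p in featurizer.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-2)
for _ in range(100):
    risks = torch.stack([F.cross_entropy(classifier(featurizer(x)), y) for x, y in envs])
    loss = risks.mean() + 10.0 * risks.var()
    opt.zero_grad(); loss.backward(); opt.step()
```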

Follower Agnostic Methods for Stackelberg Games

no code implementations 2 Feb 2023 Chinmay Maheshwari, James Cheng, S. Shankar Sastry, Lillian Ratliff, Eric Mazumdar

In this paper, we present an efficient algorithm to solve online Stackelberg games, featuring multiple followers, in a follower-agnostic manner.

DGI: Easy and Efficient Inference for GNNs

no code implementations 28 Nov 2022 Peiqi Yin, Xiao Yan, Jinjing Zhou, Qiang Fu, Zhenkun Cai, James Cheng, Bo Tang, Minjie Wang

In this paper, we develop Deep Graph Inference (DGI) -- a system for easy and efficient GNN model inference, which automatically translates the training code of a GNN model for layer-wise execution.

A Representation Learning Framework for Property Graphs

1 code implementation Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2019 Yifan Hou, Hongzhi Chen, Changji Li, James Cheng, Ming-Chang Yang

Representation learning on graphs, also called graph embedding, has demonstrated its significant impact on a series of machine learning applications such as classification, prediction and recommendation.

Graph Embedding Graph Representation Learning +3

Efficient Private SCO for Heavy-Tailed Data via Clipping

no code implementations 27 Jun 2022 Chenhan Jin, Kaiwen Zhou, Bo Han, Ming-Chang Yang, James Cheng

In this paper, we resolve this issue and derive the first high-probability bounds for the private stochastic method with clipping.

Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack

1 code implementation 15 Jun 2022 Ruize Gao, Jiongxiao Wang, Kaiwen Zhou, Feng Liu, Binghui Xie, Gang Niu, Bo Han, James Cheng

The AutoAttack (AA) has been the most reliable method to evaluate adversarial robustness when considerable computational resources are available.

Adversarial Robustness Computational Efficiency

An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms

no code implementations 28 Apr 2022 Binghui Xie, Chenhan Jin, Kaiwen Zhou, James Cheng, Wei Meng

Stochastic variance reduced methods have shown strong performance in solving finite-sum problems.

Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

1 code implementation ICLR 2022 Yongqiang Chen, Han Yang, Yonggang Zhang, Kaili Ma, Tongliang Liu, Bo Han, James Cheng

Recently, Graph Injection Attack (GIA) has emerged as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject a few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA).

Learning Causally Invariant Representations for Out-of-Distribution Generalization on Graphs

3 code implementations 11 Feb 2022 Yongqiang Chen, Yonggang Zhang, Yatao Bian, Han Yang, Kaili Ma, Binghui Xie, Tongliang Liu, Bo Han, James Cheng

Despite recent success in using the invariance principle for out-of-distribution (OOD) generalization on Euclidean data (e.g., images), studies on graph data are still limited.

Drug Discovery Graph Learning +1

Accelerating Perturbed Stochastic Iterates in Asynchronous Lock-Free Optimization

no code implementations 30 Sep 2021 Kaiwen Zhou, Anthony Man-Cho So, James Cheng

We show that stochastic acceleration can be achieved under the perturbed iterate framework (Mania et al., 2017) in asynchronous lock-free optimization, which leads to the optimal incremental gradient complexity for finite-sum objectives.

Local Reweighting for Adversarial Training

no code implementations 30 Jun 2021 Ruize Gao, Feng Liu, Kaiwen Zhou, Gang Niu, Bo Han, James Cheng

However, when tested on attacks different from the given attack simulated in training, the robustness may drop significantly (e.g., even worse than no reweighting).

Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums

no code implementations NeurIPS 2021 Kaiwen Zhou, Lai Tian, Anthony Man-Cho So, James Cheng

In convex optimization, the problem of finding near-stationary points has not been adequately studied yet, unlike other optimality measures such as the function value.

DGCL: an efficient communication library for distributed GNN training

1 code implementation Proceedings of the Sixteenth European Conference on Computer Systems 2021 Zhenkun Cai, Xiao Yan, Yidi Wu, Kaihao Ma, James Cheng, Fan Yu

Graph neural networks (GNNs) have gained increasing popularity in many areas such as e-commerce, social networks and bio-informatics.

Elastic Deep Learning in Multi-Tenant GPU Clusters

no code implementations IEEE Transactions on Parallel and Distributed Systems 2021 Yidi Wu, Kaihao Ma, Xiao Yan, Zhi Liu, Zhenkun Cai, Yuzhen Huang, James Cheng, Han Yuan, Fan Yu

We study how to support elasticity, that is, the ability to dynamically adjust the parallelism (i.e., the number of GPUs), for deep neural network (DNN) training in a GPU cluster.

Management Scheduling

Calibrating and Improving Graph Contrastive Learning

1 code implementation 27 Jan 2021 Kaili Ma, Haochen Yang, Han Yang, Yongqiang Chen, James Cheng

To assess the discrepancy between the prediction and the ground-truth in the downstream tasks for these contrastive pairs, we adapt the expected calibration error (ECE) to graph contrastive learning.

Contrastive Learning Graph Clustering +3
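
For reference, a minimal sketch of the standard expected calibration error (ECE) that the abstract says is adapted to graph contrastive learning; the binning scheme here is illustrative, not the paper's exact procedure.

```python
# Standard ECE: bin predictions by confidence and average the gap between
# per-bin accuracy and per-bin confidence, weighted by bin size.
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

print(expected_calibration_error([0.9, 0.8, 0.6], [1, 0, 1], [1, 0, 0]))
```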

The item selection problem for user cold-start recommendation

no code implementations 27 Oct 2020 Yitong Meng, Jie Liu, Xiao Yan, James Cheng

When a new user just signs up on a website, we usually have no information about him/her, i.e., no interactions with items, no user profile, and no social links with other users.

Recommendation Systems

Rethinking Graph Regularization for Graph Neural Networks

1 code implementation 4 Sep 2020 Han Yang, Kaili Ma, James Cheng

The graph Laplacian regularization term is usually used in semi-supervised representation learning to provide graph structure information for a model $f(X)$.

Node Classification Representation Learning
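
For context, the graph Laplacian regularization term referred to above is commonly written as follows (standard form with L = D - A; notation assumed, not taken from the paper):

```latex
% Graph Laplacian regularization on model outputs f(X), where L = D - A.
\mathcal{L}_{\mathrm{reg}}
  = \operatorname{tr}\!\left( f(X)^{\top} L\, f(X) \right)
  = \frac{1}{2} \sum_{i,j} A_{ij} \left\lVert f(X)_i - f(X)_j \right\rVert_2^2
```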

Understanding Graph Neural Networks from Graph Signal Denoising Perspectives

1 code implementation 8 Jun 2020 Guoji Fu, Yifan Hou, Jian Zhang, Kaili Ma, Barakeel Fanseu Kamhoua, James Cheng

This paper aims to provide a theoretical framework to understand GNNs, specifically, spectral graph convolutional networks and graph attention networks, from graph signal denoising perspectives.

Denoising Graph Attention +2

Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

1 code implementation NeurIPS 2020 Kaiwen Zhou, Anthony Man-Cho So, James Cheng

Specifically, instead of tackling the original objective directly, we construct a shifted objective function that has the same minimizer as the original objective and encodes both the smoothness and strong convexity of the original objective in an interpolation condition.

TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism

1 code implementation 16 Apr 2020 Zhenkun Cai, Kaihao Ma, Xiao Yan, Yidi Wu, Yuzhen Huang, James Cheng, Teng Su, Fan Yu

A good parallelization strategy can significantly improve the efficiency or reduce the cost for the distributed training of deep neural networks (DNNs).

Self-Enhanced GNN: Improving Graph Neural Networks Using Model Outputs

1 code implementation 18 Feb 2020 Han Yang, Xiao Yan, Xinyan Dai, Yongqiang Chen, James Cheng

In this paper, we propose self-enhanced GNN (SEG), which improves the quality of the input data using the outputs of existing GNN models for better performance on semi-supervised node classification.

General Classification Node Classification

Convolutional Embedding for Edit Distance

2 code implementations 31 Jan 2020 Xinyan Dai, Xiao Yan, Kaiwen Zhou, Yuxuan Wang, Han Yang, James Cheng

Edit-distance-based string similarity search has many applications such as spell correction, data de-duplication, and sequence alignment.
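
For reference, a minimal dynamic-programming implementation of the edit (Levenshtein) distance that such embeddings aim to approximate; this is background, not the paper's method.

```python
# Classic Levenshtein edit distance via dynamic programming with a rolling row.
def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```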

Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search

2 code implementations 12 Nov 2019 Xinyan Dai, Xiao Yan, Kelvin K. W. Ng, Jie Liu, James Cheng

In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error.

Data Compression Quantization
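
A toy illustration of the norm/direction decomposition described above (the quantizers here are deliberately crude placeholders, not the NEQ quantizers): a vector is split into its norm and its unit direction, and each part contributes its own error to the reconstruction.

```python
# Toy norm/direction decomposition mirroring the error view in the abstract.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=128)

norm = np.linalg.norm(x)
direction = x / norm

q_norm = round(norm, 1)                        # crude scalar quantizer for the norm
q_dir = np.sign(direction) / np.sqrt(x.size)   # crude unit-vector quantizer (sign bits)

x_hat = q_norm * q_dir
print("norm error:", abs(norm - q_norm))
print("direction error:", np.linalg.norm(direction - q_dir))
print("total reconstruction error:", np.linalg.norm(x - x_hat))
```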

Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning

1 code implementation 12 Nov 2019 Xinyan Dai, Xiao Yan, Kaiwen Zhou, Han Yang, Kelvin K. W. Ng, James Cheng, Yu Fan

In particular, at the high compression ratio end, HSQ provides a low per-iteration communication cost of $O(\log d)$, which is favorable for federated learning.

Federated Learning Quantization

Understanding and Improving Proximity Graph based Maximum Inner Product Search

no code implementations 30 Sep 2019 Jie Liu, Xiao Yan, Xinyan Dai, Zhirong Li, James Cheng, Ming-Chang Yang

Then we explain the good performance of ip-NSW as matching the norm bias of the MIPS problem: large-norm items have large in-degrees in the ip-NSW proximity graph, and a walk on the graph spends the majority of computation on these items, thus effectively avoiding unnecessary computation on small-norm items.

Amortized Nesterov's Momentum: Robust and Lightweight Momentum for Deep Learning

no code implementations 25 Sep 2019 Kaiwen Zhou, Yanghua Jin, Qinghua Ding, James Cheng

Stochastic Gradient Descent (SGD) with Nesterov's momentum is a widely used optimizer in deep learning, which is observed to have excellent generalization performance.

PMD: An Optimal Transportation-based User Distance for Recommender Systems

no code implementations 10 Sep 2019 Yitong Meng, Xinyan Dai, Xiao Yan, James Cheng, Weiwen Liu, Benben Liao, Jun Guo, Guangyong Chen

Collaborative filtering, a widely-used recommendation technique, predicts a user's preference by aggregating the ratings from similar users.

Collaborative Filtering Recommendation Systems

Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)

1 code implementation 22 Oct 2018 Xiao Yan, Xinyan Dai, Jie Liu, Kaiwen Zhou, James Cheng

Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS and several algorithms including $L_2$-ALSH, Sign-ALSH and Simple-LSH have been proposed.

Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications

no code implementations 11 Oct 2018 Fanhua Shang, James Cheng, Yuanyuan Liu, Zhi-Quan Luo, Zhouchen Lin

The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven to be effective priors for many applications such as background modeling, photometric stereo and image alignment.

Moving Object Detection object-detection

ASVRG: Accelerated Proximal SVRG

no code implementations 7 Oct 2018 Fanhua Shang, Licheng Jiao, Kaiwen Zhou, James Cheng, Yan Ren, Yufei Jin

This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick.

Norm-Ranging LSH for Maximum Inner Product Search

1 code implementation NeurIPS 2018 Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng

Neyshabur and Srebro proposed Simple-LSH, the state-of-the-art hashing method for maximum inner product search (MIPS) with a performance guarantee.
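
For context, the Simple-LSH reduction referred to above uses the following asymmetric transform (the standard construction of Neyshabur and Srebro, stated from memory): after scaling items so that their norms are at most 1, inner products are preserved and MIPS reduces to angular similarity search.

```latex
% Simple-LSH asymmetric transform (items x scaled so that \lVert x \rVert_2 \le 1).
P(x) = \left[ x;\ \sqrt{1 - \lVert x \rVert_2^{2}} \right], \qquad
Q(q) = \left[ q;\ 0 \right], \qquad
\langle Q(q),\, P(x) \rangle = \langle q,\, x \rangle
```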

A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates

no code implementations ICML 2018 Kaiwen Zhou, Fanhua Shang, James Cheng

Recent years have witnessed exciting progress in the study of stochastic variance reduced gradient methods (e.g., SVRG, SAGA), their accelerated variants (e.g., Katyusha) and their extensions in many different settings (e.g., online, sparse, asynchronous, distributed).

Tractable and Scalable Schatten Quasi-Norm Approximations for Rank Minimization

no code implementations 28 Feb 2018 Fanhua Shang, Yuanyuan Liu, James Cheng

The Schatten quasi-norm was introduced to bridge the gap between the trace norm and rank function.

Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization

no code implementations 26 Feb 2018 Fanhua Shang, Yuanyuan Liu, Kaiwen Zhou, James Cheng, Kelvin K. W. Ng, Yuichi Yoshida

In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of stochastic variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct.

Stochastic Optimization

VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning

1 code implementation 26 Feb 2018 Fanhua Shang, Kaiwen Zhou, Hongying Liu, James Cheng, Ivor W. Tsang, Lijun Zhang, DaCheng Tao, Licheng Jiao

In this paper, we propose a simple variant of the original SVRG, called variance reduced stochastic gradient descent (VR-SGD).

BIG-bench Machine Learning

Accelerated First-order Methods for Geodesically Convex Optimization on Riemannian Manifolds

no code implementations NeurIPS 2017 Yuanyuan Liu, Fanhua Shang, James Cheng, Hong Cheng, Licheng Jiao

In this paper, we propose an accelerated first-order method for geodesically convex optimization, which generalizes Nesterov's standard accelerated method from Euclidean space to nonlinear Riemannian space.
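
For reference, one common form of the Euclidean scheme being generalized, Nesterov's accelerated gradient step, is:

```latex
% Nesterov's accelerated gradient in Euclidean space (one common form).
y_k = x_k + \beta_k \, (x_k - x_{k-1}), \qquad
x_{k+1} = y_k - \eta \, \nabla f(y_k)
```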

Accelerated Variance Reduced Stochastic ADMM

no code implementations 11 Jul 2017 Yuanyuan Liu, Fanhua Shang, James Cheng

Besides having a per-iteration complexity as low as existing stochastic ADMM methods, ASVRG-ADMM improves the convergence rate on general convex problems from O(1/T) to O(1/T^2).

Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

no code implementations 23 Mar 2017 Fanhua Shang, Yuanyuan Liu, James Cheng, Jiacheng Zhuo

Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems).

BIG-bench Machine Learning regression

Guaranteed Sufficient Decrease for Variance Reduced Stochastic Gradient Descent

no code implementations 20 Mar 2017 Fanhua Shang, Yuanyuan Liu, James Cheng, Kelvin Kai Wing Ng, Yuichi Yoshida

In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct.

Stochastic Optimization

Scalable Algorithms for Tractable Schatten Quasi-Norm Minimization

no code implementations 4 Jun 2016 Fanhua Shang, Yuanyuan Liu, James Cheng

In this paper, we first define two tractable Schatten quasi-norms, i.e., the Frobenius/nuclear hybrid and bi-nuclear quasi-norms, and then prove that they are in essence the Schatten-2/3 and 1/2 quasi-norms, respectively, which leads to the design of very efficient algorithms that only need to update two much smaller factor matrices.

Matrix Completion

Unified Scalable Equivalent Formulations for Schatten Quasi-Norms

no code implementations 2 Jun 2016 Fanhua Shang, Yuanyuan Liu, James Cheng

In this paper, we rigorously prove that for any p, p1, p2>0 satisfying 1/p=1/p1+1/p2, the Schatten-p quasi-norm of any matrix is equivalent to minimizing the product of the Schatten-p1 norm (or quasi-norm) and Schatten-p2 norm (or quasi-norm) of its two factor matrices.
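
Rendered as a formula, the stated equivalence reads roughly as follows (the exact normalization and constraints follow the paper; this is a paraphrase of the sentence above):

```latex
% Factorization form of the Schatten-p quasi-norm, with 1/p = 1/p_1 + 1/p_2.
\lVert X \rVert_{S_p}
  = \min_{X = U V^{\top}} \lVert U \rVert_{S_{p_1}} \, \lVert V \rVert_{S_{p_2}}
```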

Regularized Orthogonal Tensor Decompositions for Multi-Relational Learning

no code implementations 26 Dec 2015 Fanhua Shang, James Cheng, Hong Cheng

We first induce the equivalence relation between the Schatten p-norm (0 < p < ∞) of a low multi-linear rank tensor and that of its core tensor.

Relational Reasoning

Generalized Higher-Order Orthogonal Iteration for Tensor Decomposition and Completion

no code implementations NeurIPS 2014 Yuanyuan Liu, Fanhua Shang, Wei Fan, James Cheng, Hong Cheng

Then the Schatten 1-norm of the core tensor is used to replace that of the whole tensor, which leads to a much smaller-scale matrix SNM problem.

Tensor Decomposition

Structured Low-Rank Matrix Factorization with Missing and Grossly Corrupted Observations

no code implementations 3 Sep 2014 Fanhua Shang, Yuanyuan Liu, Hanghang Tong, James Cheng, Hong Cheng

In this paper, we propose a scalable, provable structured low-rank matrix factorization method to recover low-rank and sparse matrices from missing and grossly corrupted data, i.e., robust matrix completion (RMC) problems, or incomplete and grossly corrupted measurements, i.e., compressive principal component pursuit (CPCP) problems.

Matrix Completion
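
For context, a common convex formulation of the robust matrix completion problem mentioned above (a standard model, not necessarily the exact one solved in the paper) is:

```latex
% Convex RMC: recover low-rank L and sparse S from partially observed,
% grossly corrupted entries of M on the index set \Omega.
\min_{L,\, S} \ \lVert L \rVert_{*} + \lambda \lVert S \rVert_{1}
\quad \text{s.t.} \quad \mathcal{P}_{\Omega}(L + S) = \mathcal{P}_{\Omega}(M)
```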

Generalized Higher-Order Tensor Decomposition via Parallel ADMM

no code implementations 5 Jul 2014 Fanhua Shang, Yuanyuan Liu, James Cheng

To address these problems, we first propose a parallel trace norm regularized tensor decomposition method, and formulate it as a convex optimization problem.

Computational Efficiency Tensor Decomposition

Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media

no code implementations 24 Feb 2014 Linhong Zhu, Aram Galstyan, James Cheng, Kristina Lerman

We further investigate the evolution of user-level sentiments and latent feature vectors in an online framework and devise an efficient online algorithm to sequentially update the clustering of tweets, users and features with newly arrived data.

Clustering Graph Clustering +1

Truss Decomposition in Massive Networks

4 code implementations 30 May 2012 Jia Wang, James Cheng

We first improve the existing in-memory algorithm for computing k-truss in networks of moderate size.

Databases
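
A minimal in-memory k-truss sketch (an illustration of the definition, not the paper's improved algorithm): repeatedly peel edges that are contained in fewer than k - 2 triangles.

```python
# Illustrative k-truss peeling: remove edges whose support (number of
# triangles containing them) is below k - 2 until none remain.
import networkx as nx

def k_truss(G, k):
    H = G.copy()
    changed = True
    while changed:
        changed = False
        for u, v in list(H.edges()):
            support = len(set(H[u]) & set(H[v]))  # triangles through (u, v)
            if support < k - 2:
                H.remove_edge(u, v)
                changed = True
    H.remove_nodes_from([n for n in H if H.degree(n) == 0])
    return H

G = nx.karate_club_graph()
print(k_truss(G, 4).number_of_edges())
```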
