Search Results for author: Wu-Jun Li

Found 49 papers, 4 papers with code

InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning

1 code implementation · 30 Mar 2024 · Yan-Shuo Liang, Wu-Jun Li

Furthermore, InfLoRA designs this subspace to eliminate the interference of the new task on the old tasks, making a good trade-off between stability and plasticity.

Continual Learning
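
Since InfLoRA builds on low-rank adaptation, a minimal sketch of the generic LoRA forward pass may help; the interference-free subspace design that is the paper's actual contribution is not shown, and all names below are illustrative.

```python
import numpy as np

def lora_forward(x, W, A, B):
    """Generic low-rank adaptation: a frozen pre-trained weight W is
    augmented with a trainable rank-r update B @ A, so task-specific
    learning happens inside the small subspace defined by A. InfLoRA's
    contribution (not shown) is designing that subspace so new-task
    updates do not interfere with old tasks."""
    # Shapes: W (d_out, d_in), A (r, d_in), B (d_out, r), x (d_in,)
    return W @ x + B @ (A @ x)
```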

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

no code implementations · 31 Jul 2023 · Hao Lin, Ke Wu, Jie Li, Jun Li, Wu-Jun Li

To the best of our knowledge, UniAP is the first parallel method that can jointly optimize the two categories of parallel strategies to find an optimal solution.

On the Optimal Batch Size for Byzantine-Robust Distributed Learning

no code implementations · 23 May 2023 · Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li

However, for existing BRDL methods, large batch sizes lead to a drop in model accuracy, even when there is no Byzantine attack.

GNNFormer: A Graph-based Framework for Cytopathology Report Generation

no code implementations · 17 Mar 2023 · Yang-Fan Zhou, Kai-Lang Yao, Wu-Jun Li

A common weakness of these works is that they do not explicitly model the structural information among cells, which is a key feature of pathology images and provides significant information for making diagnoses.

Caption Generation

FedREP: A Byzantine-Robust, Communication-Efficient and Privacy-Preserving Framework for Federated Learning

no code implementations · 9 Mar 2023 · Yi-Rui Yang, Kun Wang, Wu-Jun Li

Based on ConSpar, we further propose a novel FL framework called FedREP, which is Byzantine-robust, communication-efficient and privacy-preserving.

Federated Learning · Privacy Preserving

Asymmetric Learning for Graph Neural Network based Link Prediction

no code implementations · 1 Mar 2023 · Kai-Lang Yao, Wu-Jun Li

To the best of our knowledge, AML is the first GNN-LP method adopting an asymmetric learning strategy for node representation learning.

Link Prediction · Representation Learning

Adaptive Plasticity Improvement for Continual Learning

no code implementations · CVPR 2023 · Yan-Shuo Liang, Wu-Jun Li

Besides the ability to overcome catastrophic forgetting (CF) on old tasks, API also tries to evaluate the model's plasticity and then adaptively improve it for learning a new task when necessary.

Continual Learning

Optimizing Class Distribution in Memory for Multi-Label Online Continual Learning

no code implementations · 23 Sep 2022 · Yan-Shuo Liang, Wu-Jun Li

However, the class distribution in memory is critical for replay-based methods to achieve good performance, especially when the class distribution in the data stream is highly imbalanced.

Continual Learning
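
To illustrate why the class distribution in memory matters, here is a hedged sketch of class-balanced replay sampling (single-label for simplicity; this is not the paper's memory-optimization method, and all names are illustrative):

```python
import random
from collections import defaultdict

def class_balanced_replay(memory, batch_size):
    """Draw a replay batch with roughly equal representation per class,
    even when the stored memory (like the data stream) is imbalanced.
    `memory` is a list of (example, label) pairs."""
    by_class = defaultdict(list)
    for example, label in memory:
        by_class[label].append((example, label))
    per_class = max(1, batch_size // len(by_class))
    batch = []
    for items in by_class.values():
        batch.extend(random.sample(items, min(per_class, len(items))))
    return batch
```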

State-based Episodic Memory for Multi-Agent Reinforcement Learning

no code implementations · 19 Oct 2021 · Xiao Ma, Wu-Jun Li

SEM adopts episodic memory (EM) to supervise the centralized training procedure of the centralized-training-with-decentralized-execution (CTDE) paradigm in multi-agent reinforcement learning (MARL).

reinforcement-learning · Reinforcement Learning (RL) +2

Full-Precision Free Binary Graph Neural Networks

no code implementations · 29 Sep 2021 · Kai-Lang Yao, Wu-Jun Li

Binary neural networks have become a promising research topic due to their advantages of fast inference speed and low energy consumption.

Quantization

Optimizing Class Distribution in Memory for Multi-Label Continual Learning

no code implementations · 29 Sep 2021 · Yan-Shuo Liang, Wu-Jun Li

Most existing replay-based methods focus on single-label problems in which each sample in the data stream has only one label.

Continual Learning

Multiple Code Hashing for Efficient Image Retrieval

no code implementations · 4 Aug 2020 · Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li

In this paper, we propose a novel hashing framework, called multiple code hashing (MCH), to improve the performance of hash bucket search.

Image Retrieval · Retrieval

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval

no code implementations · ECCV 2020 · Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie

Retrieving content-relevant images from a large-scale fine-grained dataset can suffer from intolerably slow query speed and highly redundant storage cost, due to the high-dimensional real-valued embeddings that aim to distinguish subtle visual differences among fine-grained objects.

Image Retrieval · Retrieval

Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training

no code implementations · 28 Jul 2020 · Shen-Yi Zhao, Chang-Wei Shi, Yin-Peng Xie, Wu-Jun Li

Empirical results on deep learning verify that when adopting the same large batch size, SNGM can achieve better test accuracy than MSGD and other state-of-the-art large-batch training methods.
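
For context, a minimal sketch of one common form of stochastic normalized gradient descent with momentum (the exact SNGM update should be taken from the paper; hyperparameters here are illustrative):

```python
import numpy as np

def sngm_step(w, grad, momentum, lr=0.1, beta=0.9, eps=1e-8):
    """One step of normalized SGD with momentum: normalizing the
    stochastic gradient makes the update size insensitive to the
    gradient scale, which is one motivation for large-batch training.
    Illustrative sketch, not the paper's exact update rule."""
    g_unit = grad / (np.linalg.norm(grad) + eps)  # normalize the stochastic gradient
    momentum = beta * momentum + g_unit           # momentum on the normalized direction
    w = w - lr * momentum                         # parameter update
    return w, momentum
```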

TOMA: Topological Map Abstraction for Reinforcement Learning

no code implementations · 11 May 2020 · Zhao-Heng Yin, Wu-Jun Li

In particular, we propose planning to explore, in which TOMA is used to accelerate exploration by guiding the agent towards unexplored states.

Graph Generation · reinforcement-learning +1

Buffered Asynchronous SGD for Byzantine Learning

no code implementations · 2 Mar 2020 · Yi-Rui Yang, Wu-Jun Li

In this paper, we propose a novel method, called buffered asynchronous stochastic gradient descent (BASGD), for ABL.

Edge-computing · Federated Learning
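
As a rough illustration of the buffering idea (the buffer-assignment and aggregation rules in BASGD may differ; this sketch uses a coordinate-wise median as an example of robust aggregation):

```python
import numpy as np

def basgd_server_step(w, buffers, lr=0.1):
    """Buffered aggregation for asynchronous Byzantine learning:
    asynchronously arriving gradients have been placed into B buffers;
    once every buffer is non-empty, aggregate the buffer means with a
    robust statistic and update the model."""
    if any(len(buf) == 0 for buf in buffers):
        return w                                    # wait until all buffers are filled
    means = np.stack([np.mean(buf, axis=0) for buf in buffers])
    robust_grad = np.median(means, axis=0)          # coordinate-wise median across buffers
    for buf in buffers:
        buf.clear()                                 # reset buffers after each update
    return w - lr * robust_grad
```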

Stagewise Enlargement of Batch Size for SGD-based Learning

no code implementations · 26 Feb 2020 · Shen-Yi Zhao, Yin-Peng Xie, Wu-Jun Li

We theoretically prove that, compared to classical stagewise SGD, which decreases the learning rate by stage, SEBS can reduce the number of parameter updates without increasing the generalization error.
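
A toy training loop contrasting the two schedules (the `model` and `data` interfaces are hypothetical placeholders and the stage schedule is illustrative, not the paper's):

```python
def train_sebs(model, data, batch_size=128, lr=0.1,
               stages=3, enlarge_factor=2, steps_per_stage=1000):
    """Stagewise Enlargement of Batch Size: where classical stagewise
    SGD decays the learning rate at each stage, SEBS-style training
    keeps the learning rate fixed and enlarges the batch size instead,
    so fewer parameter updates are needed per stage."""
    for stage in range(stages):
        for _ in range(steps_per_stage):
            x, y = data.sample(batch_size)   # mini-batch of the current size
            model.sgd_step(x, y, lr)         # learning rate never decays
        batch_size *= enlarge_factor         # stage transition: enlarge the batch
    return model
```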

Weight Normalization based Quantization for Deep Neural Network Compression

no code implementations · 1 Jul 2019 · Wen-Pu Cai, Wu-Jun Li

WNQ adopts weight normalization to avoid the long-tail distribution of network weights and subsequently reduces the quantization error.

Neural Network Compression · Quantization
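
A hedged sketch of the normalize-then-quantize idea (per-tensor scaling, clipping, and bit-width below are assumptions for illustration, not the paper's exact procedure):

```python
import numpy as np

def wnq_quantize(w, n_bits=4):
    """Normalize weights to tame the long-tail distribution, quantize
    the normalized values uniformly, then rescale back."""
    scale = np.std(w) + 1e-8
    w_norm = np.clip(w / scale, -1.0, 1.0)        # weight normalization + tail clipping
    levels = 2 ** (n_bits - 1) - 1
    w_quant = np.round(w_norm * levels) / levels  # uniform quantization to n_bits
    return w_quant * scale                        # rescale to the original range
```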

ADASS: Adaptive Sample Selection for Training Acceleration

no code implementations · 11 Jun 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

However, in SGD and all its existing variants, the sample size in each iteration (epoch) of training is the same as the size of the full training set.

Clustered Reinforcement Learning

no code implementations · 6 Jun 2019 · Xiao Ma, Shen-Yi Zhao, Wu-Jun Li

Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards.

Atari Games · Clustering +4

Global Momentum Compression for Sparse Communication in Distributed Learning

no code implementations · 30 May 2019 · Chang-Wei Shi, Shen-Yi Zhao, Yin-Peng Xie, Hao Gao, Wu-Jun Li

With the rapid growth of data, distributed momentum stochastic gradient descent (DMSGD) has been widely used in distributed learning, especially for training large-scale deep models.

On the Convergence of Memory-Based Distributed SGD

no code implementations · 30 May 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Using the transformation equation, we derive the convergence rate of stagewise M-DSGD, which bridges the gap between theory and practice.

Collaborative Self-Attention for Recommender Systems

no code implementations · 27 May 2019 · Kai-Lang Yao, Wu-Jun Li

Recommender systems (RS), which have been an essential part in a wide range of applications, can be formulated as a matrix completion (MC) problem.

Matrix Completion · Recommendation Systems
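
For reference, the textbook low-rank matrix-completion formulation such work starts from (a standard form, not this paper's specific objective), with observed entries $\Omega$ of a rating matrix $R$ and factor matrices $U$, $V$:

```latex
\min_{U, V} \; \sum_{(i,j) \in \Omega} \left( R_{ij} - u_i^\top v_j \right)^2
  + \lambda \left( \lVert U \rVert_F^2 + \lVert V \rVert_F^2 \right)
```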

Deep Multi-Index Hashing for Person Re-Identification

no code implementations · 27 May 2019 · Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li

In this paper, we propose a novel hashing method, called deep multi-index hashing (DMIH), to improve both efficiency and accuracy for ReID.

Person Re-Identification

On the Evaluation Metric for Hashing

no code implementations · 27 May 2019 · Qing-Yuan Jiang, Ming-Wei Li, Wu-Jun Li

Bucket search, also called hash lookup, can achieve fast query speed with a sub-linear time cost based on the inverted index table constructed from hash codes.

Retrieval
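
A minimal sketch of the inverted-index bucket search the excerpt describes, with hash codes stored as integers and multi-probing within a small Hamming radius (names and structure are illustrative):

```python
from collections import defaultdict
from itertools import combinations

def build_index(codes):
    """Inverted index table: hash code -> list of item ids.
    `codes` maps item id -> integer hash code."""
    index = defaultdict(list)
    for item_id, code in codes.items():
        index[code].append(item_id)
    return index

def hash_lookup(index, query_code, n_bits, radius=1):
    """Probe every bucket within the given Hamming radius of the query
    code; sub-linear in database size when buckets stay small."""
    results = list(index.get(query_code, []))
    for r in range(1, radius + 1):
        for flipped_bits in combinations(range(n_bits), r):
            probe = query_code
            for b in flipped_bits:
                probe ^= 1 << b              # flip bit b of the query code
            results.extend(index.get(probe, []))
    return results
```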

Hashing based Answer Selection

no code implementations · 26 May 2019 · Dong Xu, Wu-Jun Li

HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers.

Answer Selection
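
To see why binary matrix representations cut memory so sharply, a small NumPy illustration (sizes are made up, not the paper's architecture):

```python
import numpy as np

# A hypothetical answer represented as a 512 x 256 matrix.
dense = np.random.randn(512, 256).astype(np.float32)    # 512*256*4 bytes = 512 KiB
binary = dense > 0                                       # one bit of information per entry
packed = np.packbits(binary.astype(np.uint8), axis=1)   # 512*32 bytes = 16 KiB
print(dense.nbytes // packed.nbytes)                     # 32x memory reduction
```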

Gated Group Self-Attention for Answer Selection

no code implementations · 26 May 2019 · Dong Xu, Jianhui Ji, Haikuan Huang, Hongbo Deng, Wu-Jun Li

Nevertheless, it is difficult for RNN-based models to capture information about long-range dependencies among words in the sentences of questions and answers.

Answer Selection · Machine Translation +1

Quantized Epoch-SGD for Communication-Efficient Distributed Learning

no code implementations · 10 Jan 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Due to its efficiency and ease of implementation, stochastic gradient descent (SGD) has been widely used in machine learning.

Quantization

Proximal SCOPE for Distributed Sparse Learning

no code implementations · NeurIPS 2018 · Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li

Based on the defined metric, we theoretically prove that pSCOPE is convergent with a linear convergence rate if the data partition is good enough.

Sparse Learning

Proximal SCOPE for Distributed Sparse Learning: Better Data Partition Implies Faster Convergence Rate

no code implementations · 15 Mar 2018 · Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li

Based on the defined metric, we theoretically prove that pSCOPE is convergent with a linear convergence rate if the data partition is good enough.

Sparse Learning

Convolutional Geometric Matrix Completion

no code implementations · 2 Mar 2018 · Kai-Lang Yao, Wu-Jun Li, Jianbo Yang, Xinyan Lu

Recently, geometric deep learning on graphs (GDLG) is proposed to solve the GMC problem, showing better performance than existing GMC methods including traditional graph regularization based methods.

Matrix Completion

Feature-Distributed SVRG for High-Dimensional Linear Classification

no code implementations · 10 Feb 2018 · Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Linear classification has been widely used in many high-dimensional applications like text classification.

General Classification · text-classification +2

Asymmetric Deep Supervised Hashing

no code implementations · 26 Jul 2017 · Qing-Yuan Jiang, Wu-Jun Li

However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and database (retrieval) points.

Retrieval

Robust Frequent Directions with Application in Online Learning

no code implementations · 15 May 2017 · Luo Luo, Cheng Chen, Zhihua Zhang, Wu-Jun Li, Tong Zhang

We also apply RFD to online learning and propose an effective hyperparameter-free online Newton algorithm.

ParaGraphE: A Library for Parallel Knowledge Graph Embedding

no code implementations · 16 Mar 2017 · Xiao-Fan Niu, Wu-Jun Li

Knowledge graph embedding aims at translating the knowledge graph into numerical representations by transforming the entities and relations into continuous low-dimensional vectors.

Knowledge Graph Embedding · Knowledge Graphs
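
As context, the translation idea behind TransE-style models, which are the kind of knowledge graph embedding a parallel library typically implements (a sketch, not ParaGraphE's actual API):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE treats a relation as a translation in embedding space:
    for a true triple (head, relation, tail), h + r should lie close
    to t, so negative distance serves as a plausibility score."""
    return -np.linalg.norm(h + r - t)

# Toy usage with random 50-dimensional embeddings.
rng = np.random.default_rng(0)
h, r, t = (rng.normal(size=50) for _ in range(3))
print(transe_score(h, r, t))
```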

Full-Time Supervision based Bidirectional RNN for Factoid Question Answering

no code implementations · 19 Jun 2016 · Dong Xu, Wu-Jun Li

Hence, these existing models do not apply supervision (loss or similarity calculation) at every time step, which loses some useful information.

Question Answering

A Proximal Stochastic Quasi-Newton Algorithm

no code implementations · 31 Jan 2016 · Luo Luo, Zihao Chen, Zhihua Zhang, Wu-Jun Li

It incorporates the Hessian in the smooth part of the function and exploits a multistage scheme to reduce the variance of the stochastic gradient.

SCOPE: Scalable Composite Optimization for Learning on Spark

1 code implementation · 30 Jan 2016 · Shen-Yi Zhao, Ru Xiang, Ying-Hao Shi, Peng Gao, Wu-Jun Li

Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve large-scale composite optimization problems, and they have shown better performance than traditional batch methods.

Stochastic Optimization

Feature Learning based Deep Supervised Hashing with Pairwise Labels

1 code implementation · 12 Nov 2015 · Wu-Jun Li, Sheng Wang, Wang-Cheng Kang

For another common application scenario with pairwise labels, there have been no methods for simultaneous feature learning and hash-code learning.

Deep Hashing · Image Retrieval
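
A hedged sketch of a pairwise-label deep hashing objective in the spirit of this line of work (the paper's exact loss should be checked against the source; tensor names and the weighting are illustrative):

```python
import torch

def pairwise_hash_loss(u, s, eta=0.1):
    """u: real-valued network outputs to be binarized, shape (n, bits);
    s: 0/1 pairwise similarity matrix, shape (n, n). The likelihood
    term pulls codes of similar pairs together; the quantization term
    pushes outputs toward +/-1 so binarization loses little."""
    theta = u @ u.t() / 2                          # pairwise inner products
    likelihood = (torch.log1p(torch.exp(theta)) - s * theta).mean()
    quantization = (u - u.sign()).pow(2).mean()    # binarization penalty
    return likelihood + eta * quantization
```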

A New Relaxation Approach to Normalized Hypergraph Cut

no code implementations · 9 Nov 2015 · Cong Xie, Wu-Jun Li, Zhihua Zhang

Normalized graph cut (NGC) has become a popular research topic due to its wide applications in a large variety of areas like machine learning and very large scale integration (VLSI) circuit design.

Clustering

A Parallel Algorithm for $\mathcal{X}$-Armed Bandits

no code implementations · 26 Oct 2015 · Cheng Chen, Shuang Liu, Zhihua Zhang, Wu-Jun Li

To deal with these large-scale data sets, we study a distributed setting of $\mathcal{X}$-armed bandits, where $m$ players collaborate to find the maximum of the unknown function.

Fast Asynchronous Parallel Stochastic Gradient Descent

no code implementations · 24 Aug 2015 · Shen-Yi Zhao, Wu-Jun Li

Stochastic gradient descent (SGD) and its variants have become increasingly popular in machine learning due to their efficiency and effectiveness.

Scalable Stochastic Alternating Direction Method of Multipliers

no code implementations · 12 Feb 2015 · Shen-Yi Zhao, Wu-Jun Li, Zhi-Hua Zhou

There exists only one stochastic method, called SA-ADMM, which can achieve a convergence rate of $O(1/T)$ on general convex problems.

Distributed Power-law Graph Computing: Theoretical and Empirical Analysis

no code implementations · NeurIPS 2014 · Cong Xie, Ling Yan, Wu-Jun Li, Zhihua Zhang

We theoretically prove that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance.

BIG-bench Machine Learning · graph partitioning

Supervised hashing with latent factor models

no code implementations · SIGIR 2014 · Peichao Zhang, Wei Zhang, Wu-Jun Li, Minyi Guo

Very recently, supervised hashing methods, which try to preserve the semantic structure constructed from the semantic labels of the training points, have exhibited higher accuracy than unsupervised methods.

Isotropic Hashing

no code implementations · NeurIPS 2012 · Weihao Kong, Wu-Jun Li

Most existing hashing methods adopt some projection functions to project the original data onto several real-valued dimensions, and then each projected dimension is quantized into one bit (zero or one) by thresholding.
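
The generic project-then-threshold pipeline the excerpt describes, with random projections standing in for learned ones (Isotropic Hashing instead learns projections whose dimensions carry equal variance):

```python
import numpy as np

def hash_by_projection(X, n_bits, seed=0):
    """Project data onto several directions, then quantize each
    projected dimension into one bit by thresholding at zero.
    Random projections here are illustrative only."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_bits))   # projection directions
    Z = (X - X.mean(axis=0)) @ W                # project the centered data
    return (Z > 0).astype(np.uint8)             # threshold into bits
```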
