Search Results for author: Wu-Jun Li

Found 49 papers, 4 papers with code

InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning

1 code implementation · 30 Mar 2024 · Yan-Shuo Liang, Wu-Jun Li

Furthermore, InfLoRA designs this subspace to eliminate the interference of the new task on the old tasks, making a good trade-off between stability and plasticity.

Continual Learning
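
Since InfLoRA builds on low-rank adaptation, a minimal sketch of the generic LoRA forward pass may help; the interference-free subspace design that is the paper's actual contribution is not shown, and all names below are illustrative.

```python
import numpy as np

def lora_forward(x, W, A, B):
    """Generic low-rank adaptation: a frozen pre-trained weight W is
    augmented with a trainable rank-r update B @ A, so task-specific
    learning happens inside the small subspace defined by A. InfLoRA's
    contribution (not shown) is designing that subspace so new-task
    updates do not interfere with old tasks."""
    # Shapes: W (d_out, d_in), A (r, d_in), B (d_out, r), x (d_in,)
    return W @ x + B @ (A @ x)
```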

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

no code implementations · 31 Jul 2023 · Hao Lin, Ke Wu, Jie Li, Jun Li, Wu-Jun Li

To the best of our knowledge, UniAP is the first parallel method that can jointly optimize the two categories of parallel strategies to find an optimal solution.

On the Optimal Batch Size for Byzantine-Robust Distributed Learning

no code implementations · 23 May 2023 · Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li

However, for existing BRDL methods, large batch sizes lead to a drop in model accuracy, even when there is no Byzantine attack.

GNNFormer: A Graph-based Framework for Cytopathology Report Generation

no code implementations · 17 Mar 2023 · Yang-Fan Zhou, Kai-Lang Yao, Wu-Jun Li

A common weakness of these works is that they do not explicitly model the structural information among cells, which is a key feature of pathology images and provides significant information for making diagnoses.

Caption Generation

FedREP: A Byzantine-Robust, Communication-Efficient and Privacy-Preserving Framework for Federated Learning

no code implementations · 9 Mar 2023 · Yi-Rui Yang, Kun Wang, Wu-Jun Li

Based on ConSpar, we further propose a novel FL framework called FedREP, which is Byzantine-robust, communication-efficient and privacy-preserving.

Federated Learning · Privacy Preserving

Asymmetric Learning for Graph Neural Network based Link Prediction

no code implementations · 1 Mar 2023 · Kai-Lang Yao, Wu-Jun Li

To the best of our knowledge, AML is the first GNN-LP method adopting an asymmetric learning strategy for node representation learning.

Link Prediction · Representation Learning

Adaptive Plasticity Improvement for Continual Learning

no code implementations · CVPR 2023 · Yan-Shuo Liang, Wu-Jun Li

Besides the ability to overcome catastrophic forgetting (CF) on old tasks, API also tries to evaluate the model's plasticity and then adaptively improve it for learning a new task when necessary.

Continual Learning

Optimizing Class Distribution in Memory for Multi-Label Online Continual Learning

no code implementations · 23 Sep 2022 · Yan-Shuo Liang, Wu-Jun Li

However, the class distribution in memory is critical for replay-based methods to achieve good performance, especially when the class distribution in the data stream is highly imbalanced.

Continual Learning
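
To illustrate why the class distribution in memory matters, here is a hedged sketch of class-balanced replay sampling (single-label for simplicity; this is not the paper's memory-optimization method, and all names are illustrative):

```python
import random
from collections import defaultdict

def class_balanced_replay(memory, batch_size):
    """Draw a replay batch with roughly equal representation per class,
    even when the stored memory (like the data stream) is imbalanced.
    `memory` is a list of (example, label) pairs."""
    by_class = defaultdict(list)
    for example, label in memory:
        by_class[label].append((example, label))
    per_class = max(1, batch_size // len(by_class))
    batch = []
    for items in by_class.values():
        batch.extend(random.sample(items, min(per_class, len(items))))
    return batch
```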

State-based Episodic Memory for Multi-Agent Reinforcement Learning

no code implementations · 19 Oct 2021 · Xiao Ma, Wu-Jun Li

SEM adopts episodic memory (EM) to supervise the centralized training procedure of the centralized-training-with-decentralized-execution (CTDE) paradigm in multi-agent reinforcement learning (MARL).

reinforcement-learning · Reinforcement Learning (RL) +2

Full-Precision Free Binary Graph Neural Networks

no code implementations · 29 Sep 2021 · Kai-Lang Yao, Wu-Jun Li

Binary neural networks have become a promising research topic due to their advantages of fast inference speed and low energy consumption.

Quantization

Optimizing Class Distribution in Memory for Multi-Label Continual Learning

no code implementations · 29 Sep 2021 · Yan-Shuo Liang, Wu-Jun Li

Most existing replay-based methods focus on single-label problems in which each sample in the data stream has only one label.

Continual Learning

Multiple Code Hashing for Efficient Image Retrieval

no code implementations · 4 Aug 2020 · Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li

In this paper, we propose a novel hashing framework, called multiple code hashing (MCH), to improve the performance of hash bucket search.

Image Retrieval · Retrieval

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval

no code implementations · ECCV 2020 · Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie

Retrieving content-relevant images from a large-scale fine-grained dataset can suffer from intolerably slow query speed and highly redundant storage cost, due to the high-dimensional real-valued embeddings that aim to distinguish subtle visual differences among fine-grained objects.

Image Retrieval · Retrieval

Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training

no code implementations · 28 Jul 2020 · Shen-Yi Zhao, Chang-Wei Shi, Yin-Peng Xie, Wu-Jun Li

Empirical results on deep learning verify that when adopting the same large batch size, SNGM can achieve better test accuracy than MSGD and other state-of-the-art large-batch training methods.
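
For context, a minimal sketch of one common form of stochastic normalized gradient descent with momentum (the exact SNGM update should be taken from the paper; hyperparameters here are illustrative):

```python
import numpy as np

def sngm_step(w, grad, momentum, lr=0.1, beta=0.9, eps=1e-8):
    """One step of normalized SGD with momentum: normalizing the
    stochastic gradient makes the update size insensitive to the
    gradient scale, which is one motivation for large-batch training.
    Illustrative sketch, not the paper's exact update rule."""
    g_unit = grad / (np.linalg.norm(grad) + eps)  # normalize the stochastic gradient
    momentum = beta * momentum + g_unit           # momentum on the normalized direction
    w = w - lr * momentum                         # parameter update
    return w, momentum
```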

TOMA: Topological Map Abstraction for Reinforcement Learning

no code implementations · 11 May 2020 · Zhao-Heng Yin, Wu-Jun Li

In particular, we propose planning to explore, in which TOMA is used to accelerate exploration by guiding the agent towards unexplored states.

Graph Generation · reinforcement-learning +1

Buffered Asynchronous SGD for Byzantine Learning

no code implementations · 2 Mar 2020 · Yi-Rui Yang, Wu-Jun Li

In this paper, we propose a novel method, called buffered asynchronous stochastic gradient descent (BASGD), for ABL.

Edge-computing · Federated Learning
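
As a rough illustration of the buffering idea (the buffer-assignment and aggregation rules in BASGD may differ; this sketch uses a coordinate-wise median as an example of robust aggregation):

```python
import numpy as np

def basgd_server_step(w, buffers, lr=0.1):
    """Buffered aggregation for asynchronous Byzantine learning:
    asynchronously arriving gradients have been placed into B buffers;
    once every buffer is non-empty, aggregate the buffer means with a
    robust statistic and update the model."""
    if any(len(buf) == 0 for buf in buffers):
        return w                                    # wait until all buffers are filled
    means = np.stack([np.mean(buf, axis=0) for buf in buffers])
    robust_grad = np.median(means, axis=0)          # coordinate-wise median across buffers
    for buf in buffers:
        buf.clear()                                 # reset buffers after each update
    return w - lr * robust_grad
```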

Stagewise Enlargement of Batch Size for SGD-based Learning

no code implementations · 26 Feb 2020 · Shen-Yi Zhao, Yin-Peng Xie, Wu-Jun Li

We theoretically prove that, compared to classical stagewise SGD, which decreases the learning rate by stage, SEBS can reduce the number of parameter updates without increasing the generalization error.
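
A toy training loop contrasting the two schedules (the `model` and `data` interfaces are hypothetical placeholders and the stage schedule is illustrative, not the paper's):

```python
def train_sebs(model, data, batch_size=128, lr=0.1,
               stages=3, enlarge_factor=2, steps_per_stage=1000):
    """Stagewise Enlargement of Batch Size: where classical stagewise
    SGD decays the learning rate at each stage, SEBS-style training
    keeps the learning rate fixed and enlarges the batch size instead,
    so fewer parameter updates are needed per stage."""
    for stage in range(stages):
        for _ in range(steps_per_stage):
            x, y = data.sample(batch_size)   # mini-batch of the current size
            model.sgd_step(x, y, lr)         # learning rate never decays
        batch_size *= enlarge_factor         # stage transition: enlarge the batch
    return model
```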

Weight Normalization based Quantization for Deep Neural Network Compression

no code implementations · 1 Jul 2019 · Wen-Pu Cai, Wu-Jun Li

WNQ adopts weight normalization to avoid the long-tail distribution of network weights and subsequently reduces the quantization error.

Neural Network Compression · Quantization
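
A hedged sketch of the normalize-then-quantize idea (per-tensor scaling, clipping, and bit-width below are assumptions for illustration, not the paper's exact procedure):

```python
import numpy as np

def wnq_quantize(w, n_bits=4):
    """Normalize weights to tame the long-tail distribution, quantize
    the normalized values uniformly, then rescale back."""
    scale = np.std(w) + 1e-8
    w_norm = np.clip(w / scale, -1.0, 1.0)        # weight normalization + tail clipping
    levels = 2 ** (n_bits - 1) - 1
    w_quant = np.round(w_norm * levels) / levels  # uniform quantization to n_bits
    return w_quant * scale                        # rescale to the original range
```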

ADASS: Adaptive Sample Selection for Training Acceleration

no code implementations · 11 Jun 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

However, in SGD and all its existing variants, the sample size in each iteration (epoch) of training is the same as the size of the full training set.

Clustered Reinforcement Learning

no code implementations · 6 Jun 2019 · Xiao Ma, Shen-Yi Zhao, Wu-Jun Li

Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards.

Atari Games · Clustering +4

Global Momentum Compression for Sparse Communication in Distributed Learning

no code implementations · 30 May 2019 · Chang-Wei Shi, Shen-Yi Zhao, Yin-Peng Xie, Hao Gao, Wu-Jun Li

With the rapid growth of data, distributed momentum stochastic gradient descent (DMSGD) has been widely used in distributed learning, especially for training large-scale deep models.

On the Convergence of Memory-Based Distributed SGD

no code implementations · 30 May 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Using the transformation equation, we derive the convergence rate of stagewise M-DSGD, which bridges the gap between theory and practice.

Collaborative Self-Attention for Recommender Systems

no code implementations · 27 May 2019 · Kai-Lang Yao, Wu-Jun Li

Recommender systems (RS), which have been an essential part in a wide range of applications, can be formulated as a matrix completion (MC) problem.

Matrix Completion · Recommendation Systems
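
For reference, the textbook low-rank matrix-completion formulation such work starts from (a standard form, not this paper's specific objective), with observed entries $\Omega$ of a rating matrix $R$ and factor matrices $U$, $V$:

```latex
\min_{U, V} \; \sum_{(i,j) \in \Omega} \left( R_{ij} - u_i^\top v_j \right)^2
  + \lambda \left( \lVert U \rVert_F^2 + \lVert V \rVert_F^2 \right)
```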

Deep Multi-Index Hashing for Person Re-Identification

no code implementations · 27 May 2019 · Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li

In this paper, we propose a novel hashing method, called deep multi-index hashing (DMIH), to improve both efficiency and accuracy for ReID.

Person Re-Identification

On the Evaluation Metric for Hashing

no code implementations · 27 May 2019 · Qing-Yuan Jiang, Ming-Wei Li, Wu-Jun Li

Bucket search, also called hash lookup, can achieve fast query speed with a sub-linear time cost based on the inverted index table constructed from hash codes.

Retrieval
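
A minimal sketch of the inverted-index bucket search the excerpt describes, with hash codes stored as integers and multi-probing within a small Hamming radius (names and structure are illustrative):

```python
from collections import defaultdict
from itertools import combinations

def build_index(codes):
    """Inverted index table: hash code -> list of item ids.
    `codes` maps item id -> integer hash code."""
    index = defaultdict(list)
    for item_id, code in codes.items():
        index[code].append(item_id)
    return index

def hash_lookup(index, query_code, n_bits, radius=1):
    """Probe every bucket within the given Hamming radius of the query
    code; sub-linear in database size when buckets stay small."""
    results = list(index.get(query_code, []))
    for r in range(1, radius + 1):
        for flipped_bits in combinations(range(n_bits), r):
            probe = query_code
            for b in flipped_bits:
                probe ^= 1 << b              # flip bit b of the query code
            results.extend(index.get(probe, []))
    return results
```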

Hashing based Answer Selection

no code implementations · 26 May 2019 · Dong Xu, Wu-Jun Li

HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers.

Answer Selection
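
To see why binary matrix representations cut memory so sharply, a small NumPy illustration (sizes are made up, not the paper's architecture):

```python
import numpy as np

# A hypothetical answer represented as a 512 x 256 matrix.
dense = np.random.randn(512, 256).astype(np.float32)    # 512*256*4 bytes = 512 KiB
binary = dense > 0                                       # one bit of information per entry
packed = np.packbits(binary.astype(np.uint8), axis=1)   # 512*32 bytes = 16 KiB
print(dense.nbytes // packed.nbytes)                     # 32x memory reduction
```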

Gated Group Self-Attention for Answer Selection

no code implementations · 26 May 2019 · Dong Xu, Jianhui Ji, Haikuan Huang, Hongbo Deng, Wu-Jun Li

Nevertheless, it is difficult for RNN-based models to capture information about long-range dependencies among words in the sentences of questions and answers.

Answer Selection · Machine Translation +1

Quantized Epoch-SGD for Communication-Efficient Distributed Learning

no code implementations · 10 Jan 2019 · Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Due to its efficiency and ease of implementation, stochastic gradient descent (SGD) has been widely used in machine learning.

Quantization

Proximal SCOPE for Distributed Sparse Learning

no code implementations · NeurIPS 2018 · Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li

Based on the defined metric, we theoretically prove that pSCOPE is convergent with a linear convergence rate if the data partition is good enough.

Sparse Learning

Proximal SCOPE for Distributed Sparse Learning: Better Data Partition Implies Faster Convergence Rate

no code implementations · 15 Mar 2018 · Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li

Based on the defined metric, we theoretically prove that pSCOPE is convergent with a linear convergence rate if the data partition is good enough.

Sparse Learning

Convolutional Geometric Matrix Completion

no code implementations · 2 Mar 2018 · Kai-Lang Yao, Wu-Jun Li, Jianbo Yang, Xinyan Lu

Recently, geometric deep learning on graphs (GDLG) is proposed to solve the GMC problem, showing better performance than existing GMC methods including traditional graph regularization based methods.

Matrix Completion

Feature-Distributed SVRG for High-Dimensional Linear Classification

no code implementations · 10 Feb 2018 · Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li

Linear classification has been widely used in many high-dimensional applications like text classification.

General Classification · text-classification +2

Asymmetric Deep Supervised Hashing

no code implementations · 26 Jul 2017 · Qing-Yuan Jiang, Wu-Jun Li

However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and database (retrieval) points.

Retrieval

Robust Frequent Directions with Application in Online Learning

no code implementations · 15 May 2017 · Luo Luo, Cheng Chen, Zhihua Zhang, Wu-Jun Li, Tong Zhang

We also apply RFD to online learning and propose an effective hyperparameter-free online Newton algorithm.

ParaGraphE: A Library for Parallel Knowledge Graph Embedding

no code implementations · 16 Mar 2017 · Xiao-Fan Niu, Wu-Jun Li

Knowledge graph embedding aims at translating the knowledge graph into numerical representations by transforming the entities and relations into continuous low-dimensional vectors.

Knowledge Graph Embedding · Knowledge Graphs
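
As context, the translation idea behind TransE-style models, which are the kind of knowledge graph embedding a parallel library typically implements (a sketch, not ParaGraphE's actual API):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE treats a relation as a translation in embedding space:
    for a true triple (head, relation, tail), h + r should lie close
    to t, so negative distance serves as a plausibility score."""
    return -np.linalg.norm(h + r - t)

# Toy usage with random 50-dimensional embeddings.
rng = np.random.default_rng(0)
h, r, t = (rng.normal(size=50) for _ in range(3))
print(transe_score(h, r, t))
```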

Full-Time Supervision based Bidirectional RNN for Factoid Question Answering

no code implementations · 19 Jun 2016 · Dong Xu, Wu-Jun Li

Hence, these existing models do not apply supervision (loss or similarity calculation) at every time step, which loses some useful information.

Question Answering

A Proximal Stochastic Quasi-Newton Algorithm

no code implementations · 31 Jan 2016 · Luo Luo, Zihao Chen, Zhihua Zhang, Wu-Jun Li

It incorporates the Hessian in the smooth part of the function and exploits a multistage scheme to reduce the variance of the stochastic gradient.

SCOPE: Scalable Composite Optimization for Learning on Spark

1 code implementation · 30 Jan 2016 · Shen-Yi Zhao, Ru Xiang, Ying-Hao Shi, Peng Gao, Wu-Jun Li

Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve large-scale composite optimization problems, and they have shown better performance than traditional batch methods.

Stochastic Optimization

Feature Learning based Deep Supervised Hashing with Pairwise Labels

1 code implementation · 12 Nov 2015 · Wu-Jun Li, Sheng Wang, Wang-Cheng Kang

For another common application scenario with pairwise labels, there have been no methods for simultaneous feature learning and hash-code learning.

Deep Hashing · Image Retrieval
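
A hedged sketch of a pairwise-label deep hashing objective in the spirit of this line of work (the paper's exact loss should be checked against the source; tensor names and the weighting are illustrative):

```python
import torch

def pairwise_hash_loss(u, s, eta=0.1):
    """u: real-valued network outputs to be binarized, shape (n, bits);
    s: 0/1 pairwise similarity matrix, shape (n, n). The likelihood
    term pulls codes of similar pairs together; the quantization term
    pushes outputs toward +/-1 so binarization loses little."""
    theta = u @ u.t() / 2                          # pairwise inner products
    likelihood = (torch.log1p(torch.exp(theta)) - s * theta).mean()
    quantization = (u - u.sign()).pow(2).mean()    # binarization penalty
    return likelihood + eta * quantization
```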

A New Relaxation Approach to Normalized Hypergraph Cut

no code implementations · 9 Nov 2015 · Cong Xie, Wu-Jun Li, Zhihua Zhang

Normalized graph cut (NGC) has become a popular research topic due to its wide applications in a large variety of areas like machine learning and very large scale integration (VLSI) circuit design.

Clustering

A Parallel Algorithm for $\mathcal{X}$-Armed Bandits

no code implementations · 26 Oct 2015 · Cheng Chen, Shuang Liu, Zhihua Zhang, Wu-Jun Li

To deal with these large-scale data sets, we study a distributed setting of $\mathcal{X}$-armed bandits, where $m$ players collaborate to find the maximum of the unknown function.

Fast Asynchronous Parallel Stochastic Gradient Descent

no code implementations · 24 Aug 2015 · Shen-Yi Zhao, Wu-Jun Li

Stochastic gradient descent (SGD) and its variants have become increasingly popular in machine learning due to their efficiency and effectiveness.

Scalable Stochastic Alternating Direction Method of Multipliers

no code implementations · 12 Feb 2015 · Shen-Yi Zhao, Wu-Jun Li, Zhi-Hua Zhou

There exists only one stochastic method, called SA-ADMM, which can achieve a convergence rate of $O(1/T)$ on general convex problems.

Distributed Power-law Graph Computing: Theoretical and Empirical Analysis

no code implementations · NeurIPS 2014 · Cong Xie, Ling Yan, Wu-Jun Li, Zhihua Zhang

We theoretically prove that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance.

BIG-bench Machine Learning · graph partitioning

Supervised hashing with latent factor models

no code implementations · SIGIR 2014 · Peichao Zhang, Wei Zhang, Wu-Jun Li, Minyi Guo

Very recently, supervised hashing methods, which try to preserve the semantic structure constructed from the semantic labels of the training points, have exhibited higher accuracy than unsupervised methods.

Isotropic Hashing

no code implementations · NeurIPS 2012 · Weihao Kong, Wu-Jun Li

Most existing hashing methods adopt some projection functions to project the original data onto several real-valued dimensions, and then each projected dimension is quantized into one bit (zero or one) by thresholding.
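
The generic project-then-threshold pipeline the excerpt describes, with random projections standing in for learned ones (Isotropic Hashing instead learns projections whose dimensions carry equal variance):

```python
import numpy as np

def hash_by_projection(X, n_bits, seed=0):
    """Project data onto several directions, then quantize each
    projected dimension into one bit by thresholding at zero.
    Random projections here are illustrative only."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_bits))   # projection directions
    Z = (X - X.mean(axis=0)) @ W                # project the centered data
    return (Z > 0).astype(np.uint8)             # threshold into bits
```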
