no code implementations • 13 Nov 2024 • Jin Han, Wu-Jun Li
Traditional alignment-based PSSS methods, which compute alignments directly on protein structures, are highly time-consuming and incur high memory costs.
no code implementations • 29 Jul 2024 • Jin Han, Yun Hong, Wu-Jun Li
In particular, DrugHash designs a simple yet effective hashing strategy to enable end-to-end learning of binary hash codes for both the protein and molecule modalities, which dramatically reduces memory and time costs while achieving higher accuracy than existing methods.
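The snippet above only names the idea; as a rough illustration of the general binary-hashing recipe it builds on (not DrugHash's actual architecture), the sketch below learns relaxed codes with a tanh head, binarizes them with sign, and screens by Hamming distance. The encoder dimensions, code length, and helper names are assumptions.

    import torch
    import torch.nn as nn

    HASH_BITS = 64  # assumed code length

    class HashHead(nn.Module):
        """Maps an input feature vector to a relaxed code in (-1, 1)."""
        def __init__(self, in_dim, bits=HASH_BITS):
            super().__init__()
            self.fc = nn.Linear(in_dim, bits)

        def forward(self, x):
            return torch.tanh(self.fc(x))  # differentiable surrogate for training

    protein_head = HashHead(in_dim=1024)    # assumed protein feature dimension
    molecule_head = HashHead(in_dim=512)    # assumed molecule feature dimension

    def binarize(codes):
        """Convert relaxed codes to +/-1 binary hash codes for screening."""
        return torch.sign(codes)

    def hamming_rank(query_code, db_codes):
        """Rank database items by Hamming distance to a query code."""
        # For +/-1 codes, Hamming distance = (bits - dot product) / 2.
        dists = (HASH_BITS - db_codes @ query_code) / 2
        return torch.argsort(dists)

Storing the database as bit codes rather than float embeddings is what yields the memory saving the entry refers to (roughly 32x smaller than float32 vectors of the same length).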
no code implementations • 27 Jul 2024 • Chang-Wei Shi, Yi-Rui Yang, Wu-Jun Li
We theoretically prove the convergence of OrMo with both constant and delay-adaptive learning rates for non-convex problems.
no code implementations • 5 Jun 2024 • Kun Wang, Yi-Rui Yang, Wu-Jun Li
Asynchronous federated learning (AFL) is an effective method to address the challenge of device heterogeneity in cross-device federated learning.
no code implementations • 31 May 2024 • Wen-Pu Cai, Wu-Jun Li
Weight quantization has been widely used for model compression, which can reduce both storage and computational cost.
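As background for this entry (and for the WNQ entry further down), here is a minimal, generic uniform weight-quantization sketch, not the method proposed in this paper:

    import numpy as np

    def quantize_weights(w, num_bits=8):
        """Uniformly quantize a float weight array to signed integers.

        Returns the integer codes plus the scale needed to dequantize;
        storage drops from 32 bits to num_bits per weight."""
        qmax = 2 ** (num_bits - 1) - 1
        max_abs = np.max(np.abs(w))
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(256, 256).astype(np.float32)
    q, s = quantize_weights(w)
    print("max reconstruction error:", np.max(np.abs(w - dequantize(q, s))))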
no code implementations • 1 May 2024 • Huan-Yi Su, Ke Wu, Yu-Hao Huang, Wu-Jun Li
One module is for adapting general-purpose LLMs to the financial domain, and the other module is for enhancing the ability of NumLLM to understand financial text with numeric variables.
1 code implementation • CVPR 2024 • Yan-Shuo Liang, Wu-Jun Li
Furthermore, InfLoRA designs this subspace to eliminate the interference of the new task on the old tasks, making a good trade-off between stability and plasticity.
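For readers unfamiliar with the LoRA family that InfLoRA extends, a plain low-rank-adaptation layer is sketched below; the rank, layer size, and initialization are assumptions, and the interference-free subspace construction that distinguishes InfLoRA is deliberately not reproduced here.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """A frozen base weight plus a trainable low-rank update B @ A."""
        def __init__(self, in_dim, out_dim, r=8):
            super().__init__()
            self.base = nn.Linear(in_dim, out_dim, bias=False)
            self.base.weight.requires_grad_(False)           # old knowledge stays fixed
            self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_dim, r))   # zero init: no change at start

        def forward(self, x):
            return self.base(x) + x @ self.A.T @ self.B.T

    layer = LoRALinear(768, 768, r=8)   # assumed transformer hidden size
    x = torch.randn(4, 768)
    print(layer(x).shape)               # torch.Size([4, 768])

InfLoRA's contribution, per the entry above, is how the subspace spanned by the low-rank factors is chosen so that new-task updates do not interfere with old tasks.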
no code implementations • 31 Jul 2023 • Hao Lin, Ke Wu, Jie Li, Jun Li, Wu-Jun Li
To the best of our knowledge, UniAP is the first parallel method that can jointly optimize the two categories of parallel strategies to find an optimal solution.
no code implementations • 23 May 2023 • Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li
However, for existing BRDL methods, large batch sizes will lead to a drop in model accuracy, even if there is no Byzantine attack.
no code implementations • 17 Mar 2023 • Yang-Fan Zhou, Kai-Lang Yao, Wu-Jun Li
A common weakness of these works is that they do not explicitly model the structural information among cells, which is a key feature of pathology images and provides significant information for making diagnoses.
no code implementations • 9 Mar 2023 • Yi-Rui Yang, Kun Wang, Wu-Jun Li
Based on ConSpar, we further propose a novel FL framework called FedREP, which is Byzantine-robust, communication-efficient and privacy-preserving.
no code implementations • 1 Mar 2023 • Kai-Lang Yao, Wu-Jun Li
To the best of our knowledge, AML is the first GNN-LP method adopting an asymmetric learning strategy for node representation learning.
no code implementations • CVPR 2023 • Yan-Shuo Liang, Wu-Jun Li
Besides the ability to overcome CF on old tasks, API also evaluates the model's plasticity and then adaptively improves it for learning a new task if necessary.
no code implementations • 23 Sep 2022 • Yan-Shuo Liang, Wu-Jun Li
However, the class distribution in memory is critical for replay-based methods to achieve good performance, especially when the class distribution in the data stream is highly imbalanced.
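To make the point concrete, a simple class-balanced replacement policy for the replay memory (a generic baseline for illustration, not this paper's method) could look like this:

    import random
    from collections import defaultdict

    class BalancedReplayMemory:
        """Keeps roughly the same number of stored samples per class."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.buckets = defaultdict(list)   # class label -> stored samples

        def add(self, sample, label):
            size = sum(len(b) for b in self.buckets.values())
            if size < self.capacity:
                self.buckets[label].append(sample)
                return
            # Memory full: evict a random sample from the currently largest class,
            # so majority classes cannot crowd out minority classes.
            largest = max(self.buckets, key=lambda c: len(self.buckets[c]))
            victims = self.buckets[largest]
            victims.pop(random.randrange(len(victims)))
            self.buckets[label].append(sample)

        def sample(self, k):
            pool = [(s, c) for c, bucket in self.buckets.items() for s in bucket]
            return random.sample(pool, min(k, len(pool)))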
no code implementations • 19 Oct 2021 • Xiao Ma, Wu-Jun Li
SEM adopts episodic memory (EM) to supervise the centralized training procedure of CTDE in MARL.
no code implementations • 29 Sep 2021 • Kai-Lang Yao, Wu-Jun Li
Binary neural networks have become a promising research topic due to their advantages of fast inference speed and low energy consumption.
no code implementations • 29 Sep 2021 • Yan-Shuo Liang, Wu-Jun Li
Most existing replay-based methods focus on single-label problems in which each sample in the data stream has only one label.
1 code implementation • 13 Sep 2021 • Zhao-Heng Yin, Lingfeng Sun, Hengbo Ma, Masayoshi Tomizuka, Wu-Jun Li
In this paper, we consider CDIL on a class of similar robots.
no code implementations • 4 Aug 2020 • Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li
In this paper, we propose a novel hashing framework, called multiple code hashing (MCH), to improve the performance of hash bucket search.
no code implementations • ECCV 2020 • Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie
Retrieving content-relevant images from a large-scale fine-grained dataset could suffer from intolerably slow query speed and highly redundant storage cost, due to the high-dimensional real-valued embeddings which aim to distinguish subtle visual differences of fine-grained objects.
no code implementations • 28 Jul 2020 • Shen-Yi Zhao, Chang-Wei Shi, Yin-Peng Xie, Wu-Jun Li
Empirical results on deep learning verify that when adopting the same large batch size, SNGM can achieve better test accuracy than MSGD and other state-of-the-art large-batch training methods.
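The SNGM update itself is not spelled out in this snippet; as a hedged illustration of the family it belongs to, the sketch below applies a generic normalized-gradient-with-momentum step (hyperparameters and the exact normalization are placeholders, not necessarily SNGM's formulation):

    import torch

    def normalized_momentum_step(params, momenta, lr=0.1, beta=0.9, eps=1e-12):
        """One generic normalized-gradient-with-momentum update (illustrative only)."""
        for p, m in zip(params, momenta):
            if p.grad is None:
                continue
            g = p.grad / (p.grad.norm() + eps)   # rescale the stochastic gradient to unit norm
            m.mul_(beta).add_(g)                 # fold it into the momentum buffer
            p.data.add_(m, alpha=-lr)            # take the step

    w = torch.nn.Parameter(torch.randn(10))
    w.grad = torch.randn(10)                     # stand-in for a real backward pass
    buf = [torch.zeros_like(w)]
    normalized_momentum_step([w], buf)

Normalizing the gradient bounds the step size regardless of the gradient's scale, which is one intuition for why such methods behave well with large batches.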
no code implementations • 11 May 2020 • Zhao-Heng Yin, Wu-Jun Li
In particular, we propose planning to explore, in which TOMA is used to accelerate exploration by guiding the agent towards unexplored states.
no code implementations • 2 Mar 2020 • Yi-Rui Yang, Wu-Jun Li
In this paper, we propose a novel method, called buffered asynchronous stochastic gradient descent (BASGD), for ABL.
no code implementations • 26 Feb 2020 • Shen-Yi Zhao, Yin-Peng Xie, Wu-Jun Li
We theoretically prove that, compared to classical stagewise SGD which decreases the learning rate by stage, SEBS can reduce the number of parameter updates without increasing generalization error.
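A concrete way to read this comparison: classical stagewise SGD keeps the batch size fixed and shrinks the learning rate at each stage, while a stage-wise batch-size-enlargement schedule keeps the learning rate and grows the batch, so later stages need fewer parameter updates to pass over the same number of samples. The toy numbers below are illustrative only, not taken from the paper:

    def stagewise_schedules(n_samples=1_000_000, stages=3,
                            base_lr=0.1, base_batch=128, factor=10):
        """Compare update counts when decaying the LR vs. enlarging the batch."""
        for s in range(stages):
            decay_updates = n_samples // base_batch                  # LR-decay stage
            grow_batch = base_batch * factor ** s
            grow_updates = n_samples // grow_batch                   # batch-enlargement stage
            print(f"stage {s}: decay-lr lr={base_lr / factor ** s:.4f} "
                  f"updates={decay_updates} | grow-batch batch={grow_batch} "
                  f"updates={grow_updates}")

    stagewise_schedules()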
no code implementations • 1 Jul 2019 • Wen-Pu Cai, Wu-Jun Li
WNQ adopts weight normalization to avoid the long-tail distribution of network weights and subsequently reduces the quantization error.
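A hedged sketch of that two-step pipeline (normalize the weight distribution, then quantize) follows; the specific normalization and clipping rule here are assumptions for illustration, not WNQ's exact formulation:

    import numpy as np

    def normalize_then_quantize(w, num_bits=4, clip_sigma=3.0):
        """Illustrative normalize-then-quantize pipeline.

        Weights are rescaled to unit standard deviation and clipped at a few
        sigma, so long-tail outliers no longer stretch the quantization range."""
        std = w.std() + 1e-12
        w_norm = np.clip(w / std, -clip_sigma, clip_sigma)
        qmax = 2 ** (num_bits - 1) - 1
        scale = clip_sigma / qmax
        q = np.round(w_norm / scale)
        return q * scale * std                 # dequantized weights for inference

    w = np.random.randn(4096) * 0.02
    print("max error:", np.abs(w - normalize_then_quantize(w)).max())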
no code implementations • 11 Jun 2019 • Shen-Yi Zhao, Hao Gao, Wu-Jun Li
However, in SGD and all of its existing variants, the sample size in each iteration (epoch) of training is the same as the size of the full training set.
no code implementations • 6 Jun 2019 • Xiao Ma, Shen-Yi Zhao, Wu-Jun Li
Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards.
no code implementations • 30 May 2019 • Shen-Yi Zhao, Hao Gao, Wu-Jun Li
Using the transformation equation, we establish the convergence rate of stagewise M-DSGD, which bridges the gap between theory and practice.
no code implementations • 30 May 2019 • Chang-Wei Shi, Shen-Yi Zhao, Yin-Peng Xie, Hao Gao, Wu-Jun Li
With the rapid growth of data, distributed momentum stochastic gradient descent (DMSGD) has been widely used in distributed learning, especially for training large-scale deep models.
no code implementations • 27 May 2019 • Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li
In this paper, we propose a novel hashing method, called deep multi-index hashing (DMIH), to improve both efficiency and accuracy for ReID.
no code implementations • 27 May 2019 • Qing-Yuan Jiang, Ming-Wei Li, Wu-Jun Li
Bucket search, also called hash lookup, can achieve fast query speed with a sub-linear time cost based on the inverted index table constructed from hash codes.
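The bucket-search idea is easy to make concrete: hash codes serve as keys of an inverted index, and a query probes only the buckets whose keys lie within a small Hamming radius. A minimal sketch (code length, radius, and data are illustrative):

    from collections import defaultdict
    from itertools import combinations

    def build_index(codes):
        """codes: dict mapping item id -> integer hash code."""
        index = defaultdict(list)
        for item_id, code in codes.items():
            index[code].append(item_id)
        return index

    def bucket_search(index, query_code, n_bits=16, radius=1):
        """Return candidate ids whose codes are within `radius` bits of the query."""
        candidates = list(index.get(query_code, []))
        for r in range(1, radius + 1):
            for flip in combinations(range(n_bits), r):
                probe = query_code
                for b in flip:
                    probe ^= 1 << b            # flip r bits to reach a nearby bucket
                candidates.extend(index.get(probe, []))
        return candidates

    index = build_index({0: 0b1010101010101010, 1: 0b1010101010101011})
    print(bucket_search(index, 0b1010101010101010))   # [0, 1]

Because only a handful of buckets are probed rather than the whole database, query time is sub-linear in the number of stored items, which is the property the entry refers to.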
no code implementations • 27 May 2019 • Kai-Lang Yao, Wu-Jun Li
Recommender systems (RS), which have been an essential part in a wide range of applications, can be formulated as a matrix completion (MC) problem.
no code implementations • 26 May 2019 • Dong Xu, Wu-Jun Li
HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers.
no code implementations • 26 May 2019 • Dong Xu, Jianhui Ji, Haikuan Huang, Hongbo Deng, Wu-Jun Li
Nevertheless, it is difficult for RNN-based models to capture information about long-range dependencies among words in the sentences of questions and answers.
no code implementations • 10 Jan 2019 • Shen-Yi Zhao, Hao Gao, Wu-Jun Li
Due to its efficiency and ease of implementation, stochastic gradient descent (SGD) has been widely used in machine learning.
no code implementations • NeurIPS 2018 • Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li
Based on the defined metric, we theoretically prove that pSCOPE is convergent with a linear convergence rate if the data partition is good enough.
no code implementations • 2 Mar 2018 • Kai-Lang Yao, Wu-Jun Li, Jianbo Yang, Xinyan Lu
Recently, geometric deep learning on graphs (GDLG) has been proposed to solve the GMC problem, showing better performance than existing GMC methods, including traditional graph-regularization-based methods.
no code implementations • 10 Feb 2018 • Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li
Linear classification has been widely used in many high-dimensional applications like text classification.
no code implementations • 26 Jul 2017 • Qing-Yuan Jiang, Wu-Jun Li
However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and database (retrieval) points.
no code implementations • 15 May 2017 • Luo Luo, Cheng Chen, Zhihua Zhang, Wu-Jun Li, Tong Zhang
We also apply RFD to online learning and propose an effective hyperparameter-free online Newton algorithm.
no code implementations • 16 Mar 2017 • Xiao-Fan Niu, Wu-Jun Li
Knowledge graph embedding aims to translate a knowledge graph into numerical representations by transforming its entities and relations into continuous low-dimensional vectors.
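As background (a classic translation-style embedding, shown for illustration rather than as this paper's model), TransE scores a triple by how well the relation vector translates the head entity onto the tail:

    import numpy as np

    dim = 50                                    # assumed embedding dimensionality
    rng = np.random.default_rng(0)
    entity_emb = {"Paris": rng.normal(size=dim), "France": rng.normal(size=dim)}
    relation_emb = {"capital_of": rng.normal(size=dim)}

    def transe_score(head, relation, tail):
        """Lower is better: a true triple should satisfy head + relation ~ tail."""
        return np.linalg.norm(entity_emb[head] + relation_emb[relation]
                              - entity_emb[tail])

    print(transe_score("Paris", "capital_of", "France"))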
no code implementations • 11 Dec 2016 • Shen-Yi Zhao, Gong-Duo Zhang, Wu-Jun Li
and AsySVRG, for non-convex problems.
no code implementations • 19 Jun 2016 • Dong Xu, Wu-Jun Li
Hence, these existing models do not apply supervision (a loss or similarity calculation) at every time step, which loses some useful information.
no code implementations • 31 Jan 2016 • Luo Luo, Zihao Chen, Zhihua Zhang, Wu-Jun Li
It incorporates the Hessian into the smooth part of the function and exploits a multistage scheme to reduce the variance of the stochastic gradient.
1 code implementation • 30 Jan 2016 • Shen-Yi Zhao, Ru Xiang, Ying-Hao Shi, Peng Gao, Wu-Jun Li
Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve large-scale composite optimization problems, and they have shown better performance than traditional batch methods.
1 code implementation • 12 Nov 2015 • Wu-Jun Li, Sheng Wang, Wang-Cheng Kang
For another common application scenario with pairwise labels, there exist no methods for simultaneous feature learning and hash-code learning.
no code implementations • 9 Nov 2015 • Cong Xie, Wu-Jun Li, Zhihua Zhang
Normalized graph cut (NGC) has become a popular research topic due to its wide applications in a large variety of areas like machine learning and very large scale integration (VLSI) circuit design.
no code implementations • 26 Oct 2015 • Cheng Chen, Shuang Liu, Zhihua Zhang, Wu-Jun Li
To deal with these large-scale data sets, we study a distributed setting of $\mathcal{X}$-armed bandits, where $m$ players collaborate to find the maximum of the unknown function.
no code implementations • 24 Aug 2015 • Shen-Yi Zhao, Wu-Jun Li
Stochastic gradient descent (SGD) and its variants have become increasingly popular in machine learning due to their efficiency and effectiveness.
no code implementations • 12 Feb 2015 • Shen-Yi Zhao, Wu-Jun Li, Zhi-Hua Zhou
There exists only one stochastic method, called SA-ADMM, which can achieve a convergence rate of $O(1/T)$ on general convex problems.
no code implementations • NeurIPS 2014 • Cong Xie, Ling Yan, Wu-Jun Li, Zhihua Zhang
We theoretically prove that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance.
no code implementations • SIGIR 2014 • Peichao Zhang, Wei Zhang, Wu-Jun Li, Minyi Guo
Very recently, supervised hashing methods, which try to preserve the semantic structure constructed from the semantic labels of the training points, have exhibited higher accuracy than unsupervised methods.
no code implementations • NeurIPS 2012 • Weihao Kong, Wu-Jun Li
Most existing hashing methods adopt some projection functions to project the original data into several dimensions of real values, and then each of these projected dimensions is quantized into one bit (zero or one) by thresholding.
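A concrete instance of the projection-then-thresholding recipe this entry describes is sign-of-random-projection hashing:

    import numpy as np

    def random_projection_hash(X, n_bits=32, seed=0):
        """Project data onto random directions, then threshold each dimension at zero."""
        rng = np.random.default_rng(seed)
        W = rng.normal(size=(X.shape[1], n_bits))   # random projection directions
        return (X @ W > 0).astype(np.uint8)         # one bit per projected dimension

    X = np.random.randn(1000, 128)                  # toy data: 1000 points, 128 dims
    codes = random_projection_hash(X)
    print(codes.shape, codes.dtype)                 # (1000, 32) uint8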
no code implementations • NeurIPS 2009 • Wu-Jun Li, Dit-Yan Yeung, Zhihua Zhang
However, the i.i.d. assumption is unreasonable for relational data.