Search Results for author: Yudong Li

Found 12 papers, 9 papers with code

Parameter-efficient Continual Learning Framework in Industrial Real-time Text Classification System

no code implementations • NAACL (ACL) 2022 • Tao Zhu, Zhe Zhao, Weijie Liu, Jiachi Liu, Yiren Chen, Weiquan Mao, Haoyan Liu, Kunbo Ding, Yudong Li, Xuefeng Yang

Catastrophic forgetting is a challenge for model deployment in industrial real-time systems, which require the model to quickly master new tasks without forgetting old ones.

Continual Learning • text-classification • +1

Dynamic data sampler for cross-language transfer learning in large language models

1 code implementation • 17 May 2024 • Yudong Li, Yuhao Feng, Wen Zhou, Zhe Zhao, Linlin Shen, Cheng Hou, Xianxu Hou

Large Language Models (LLMs) have gained significant attention in the field of natural language processing (NLP) due to their wide range of applications.

Language Modelling • Transfer Learning • +1

Rethinking Residual Connection in Training Large-Scale Spiking Neural Networks

no code implementations • 9 Nov 2023 • Yudong Li, Yunlin Lei, Xu Yang

The Spiking Neural Network (SNN) is the best-known brain-inspired model, but its non-differentiable spiking mechanism makes large-scale SNNs hard to train.

Existence and Completeness of Bounded Disturbance Observers: A Set-Membership Viewpoint

no code implementations • 6 Sep 2023 • Yudong Li, Yirui Cong, Jiuxiang Dong

We also prove that the proposed disturbance observer (DO) can achieve worst-case optimality, which provides a benchmark for the design of DOs.

VisorGPT: Learning Visual Prior via Generative Pre-Training

1 code implementation • 23 May 2023 • Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou

Experimental results demonstrate that VisorGPT can effectively model the visual prior, which can then be employed in many vision tasks, such as customizing accurate human poses for conditional image synthesis models like ControlNet.

Image Generation • Language Modelling • +1

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

3 code implementations • 13 Dec 2022 • Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan

Recently proposed pre-training models of different modalities show a rising trend of homogeneity in their model structures, which brings the opportunity to implement them within a uniform framework.
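
The uniform-framework idea in this snippet, one codebase in which models for different modalities differ only in a few interchangeable parts, can be pictured with a minimal sketch. The decomposition and all names below are illustrative assumptions, not the actual TencentPretrain API:

class PretrainModel:
    """Hypothetical uniform decomposition: embedding -> encoder -> target."""

    def __init__(self, embedding, encoder, target):
        self.embedding = embedding  # modality-specific front end: tokens, patches, audio frames
        self.encoder = encoder      # shared backbone, typically a Transformer
        self.target = target        # pre-training objective, e.g. masked-LM or contrastive loss

    def loss(self, inputs, labels):
        # All modalities follow the same forward path; only the parts differ.
        return self.target(self.encoder(self.embedding(inputs)), labels)

# Swapping only the edges yields models for different modalities:
#   text_model  = PretrainModel(word_embedding, transformer, mlm_target)
#   image_model = PretrainModel(patch_embedding, transformer, mim_target)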

Spikeformer: A Novel Architecture for Training High-Performance Low-Latency Spiking Neural Network

1 code implementation • 19 Nov 2022 • Yudong Li, Yunlin Lei, Xu Yang

Spiking neural networks (SNNs) have made great progress in both performance and efficiency over the last few years, but their unique working pattern makes it hard to train a high-performance, low-latency SNN. Thus the development of SNNs still lags behind that of traditional artificial neural networks (ANNs). To close this gap, many notable works have been proposed. Nevertheless, these works are mainly based on the same kind of network structure (i.e., CNN), and their performance is worse than that of their ANN counterparts, which limits the applications of SNNs. To this end, we propose a novel Transformer-based SNN, termed "Spikeformer", which outperforms its ANN counterpart on both static and neuromorphic datasets and may be an alternative architecture to CNNs for training high-performance SNNs. First, to deal with the "data hungry" problem and the unstable training exhibited by the vanilla model, we design the Convolutional Tokenizer (CT) module, which improves the accuracy of the original model on DVS-Gesture by more than 16%. Besides, to better incorporate the attention mechanism inside the Transformer and the spatio-temporal information inherent to SNNs, we adopt spatio-temporal attention (STA) instead of spatial-wise or temporal-wise attention. With our proposed method, we achieve competitive or state-of-the-art (SOTA) SNN performance on the DVS-CIFAR10, DVS-Gesture, and ImageNet datasets with the fewest simulation time steps (i.e., low latency). Remarkably, our Spikeformer outperforms other SNNs on ImageNet by a large margin (i.e., more than 5%) and even outperforms its ANN counterpart by 3.1% and 2.2% on DVS-Gesture and ImageNet respectively, indicating that Spikeformer is a promising architecture for training large-scale SNNs and may be more suitable for SNNs than CNNs. We believe this work can help keep the development of SNNs in step with ANNs as much as possible. Code will be available.
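
As a rough illustration of the spatio-temporal attention (STA) idea, attention can be computed jointly over the time-step and token axes rather than over one of them at a time. The tensor layout and module below are assumptions made for this sketch, not the paper's implementation:

import torch
import torch.nn as nn

class SpatioTemporalAttention(nn.Module):
    """Toy sketch: self-attention over flattened (time x token) positions."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, time steps, tokens, dim) -- assumed layout for illustration
        b, t, n, d = x.shape
        x = x.reshape(b, t * n, d)       # merge time and space into one axis
        out, _ = self.attn(x, x, x)      # every position attends across both axes
        return out.reshape(b, t, n, d)

# y = SpatioTemporalAttention(128)(torch.rand(2, 4, 64, 128))  # same shape out

By contrast, spatial-wise attention would reshape to (b * t, n, d) and temporal-wise attention to (b * n, t, d), restricting each position to a single axis.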

Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems

2 code implementations • 13 Oct 2022 • Guanghu Yuan, Fajie Yuan, Yudong Li, Beibei Kong, Shujie Li, Lei Chen, Min Yang, Chenyun Yu, Bo Hu, Zang Li, Yu Xu, XiaoHu Qie

Existing benchmark datasets for recommender systems (RS) are either created at a small scale or involve very limited forms of user feedback.

Recommendation Systems

One Person, One Model, One World: Learning Continual User Representation without Forgetting

2 code implementations • 29 Sep 2020 • Fajie Yuan, Guoxiao Zhang, Alexandros Karatzoglou, Joemon Jose, Beibei Kong, Yudong Li

In this paper, we investigate how to continually learn user representations task by task, whereby new tasks are learned while reusing partial parameters from old ones.

Recommendation Systems
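
A hedged sketch of the "partial parameters from old ones" idea: the simplest realization is to train a shared backbone on the first task, freeze it, and give each later task only a small trainable head. This is an illustrative simplification, not the paper's exact method:

import torch.nn as nn

class ContinualUserModel(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Shared user-representation backbone, trained on the first task.
        self.backbone = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                      nn.Linear(dim, dim))
        self.heads = nn.ModuleDict()  # one lightweight head per task

    def add_task(self, name, num_classes, freeze_backbone=True):
        if freeze_backbone:
            # Old parameters are reused but no longer updated, so earlier
            # tasks are not forgotten while the new head is trained.
            for p in self.backbone.parameters():
                p.requires_grad = False
        self.heads[name] = nn.Linear(128, num_classes)

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))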
