no code implementations • 24 May 2024 • Penghui Qi, Xinyi Wan, Nyamdavaa Amar, Min Lin
Our evaluations demonstrate that in pure pipeline parallelism settings, our methods outperform 1F1B by from 7% to 55% in terms of throughput.
2 code implementations • 4 Apr 2024 • Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin
We present Sailor, a family of open language models ranging from 0. 5B to 7B parameters, tailored for South-East Asian (SEA) languages.
1 code implementation • 12 Mar 2024 • Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin
Through carefully-designed synthetic tasks, covering the scenarios of full recitation, selective recitation and grounded question answering, we reveal that LMs manage to sequentially access their memory while encountering challenges in randomly accessing memorized content.
1 code implementation • 26 Feb 2024 • Yijing Liu, Chao Du, Tianyu Pang, Chongxuan Li, Wei Chen, Min Lin
Recent research has made significant progress in optimizing diffusion models for specific downstream objectives, which is an important pursuit in fields such as graph generation for drug design.
no code implementations • 19 Feb 2024 • Tianlin Li, Qian Liu, Tianyu Pang, Chao Du, Qing Guo, Yang Liu, Min Lin
The emerging success of large language models (LLMs) heavily relies on collecting abundant training data from external (untrusted) sources.
1 code implementation • 13 Feb 2024 • Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
A multimodal large language model (MLLM) agent can receive instructions, capture images, retrieve histories from memory, and decide which tools to use.
1 code implementation • 13 Feb 2024 • Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin
Backdoor attacks are commonly executed by contaminating training data, such that a trigger can activate predetermined harmful effects during the test phase.
no code implementations • 23 Jan 2024 • Zichen Liu, Chao Du, Wee Sun Lee, Min Lin
Unfortunately, NN-based models need re-training on all accumulated data at every interaction step to achieve FTL, which is computationally expensive for lifelong agents.
1 code implementation • 22 Jan 2024 • Jiawei Zhang, Tianyu Pang, Chao Du, Yi Ren, Bo Li, Min Lin
This technical report aims to fill a deficiency in the assessment of large multimodal models (LMMs) by specifically examining the self-consistency of their outputs when subjected to common corruptions.
1 code implementation • 30 Nov 2023 • Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin
Pipeline parallelism is one of the key components for large-scale distributed training, yet its efficiency suffers from pipeline bubbles which were deemed inevitable.
1 code implementation • 30 Nov 2023 • Min Lin
We present a set of primitive operators that serve as foundational building blocks for constructing several key types of functionals.
1 code implementation • 14 Nov 2023 • Ming Li, Pan Zhou, Jia-Wei Liu, Jussi Keppo, Min Lin, Shuicheng Yan, Xiangyu Xu
We achieve this remarkable speed by devising a new network that directly constructs a 3D triplane from a text prompt.
1 code implementation • 11 Nov 2023 • Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan Kankanhalli
The rapid adoption of text-to-image diffusion models in society underscores an urgent need to address their biases.
1 code implementation • 1 Nov 2023 • Xiaosen Zheng, Tianyu Pang, Chao Du, Jing Jiang, Min Lin
Data attribution seeks to trace model outputs back to training data.
2 code implementations • 4 Oct 2023 • Xiangming Gu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Ye Wang
Looking into this, we first observe that memorization behaviors tend to occur on smaller-sized datasets, which motivates our definition of effective model memorization (EMM), a metric measuring the maximum size of training data at which a learned diffusion model approximates its theoretical optimum.
1 code implementation • 29 Sep 2023 • Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón
Distributed Deep Reinforcement Learning (DRL) aims to leverage more computational resources to train autonomous agents with less training time.
1 code implementation • 25 Jul 2023 • Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, Min Lin
This paper investigates LoRA composability for cross-task generalization and introduces LoraHub, a simple framework devised for the purposive assembly of LoRA modules trained on diverse given tasks, with the objective of achieving adaptable performance on unseen tasks.
1 code implementation • NeurIPS 2023 • Stefan Lionar, Xiangyu Xu, Min Lin, Gim Hee Lee
Second, our Repulsive UDF is a novel alternative to the occupancy field used in MCC, significantly improving the quality of 3D object reconstruction.
1 code implementation • NeurIPS 2023 • Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Chongxuan Li, Ngai-Man Cheung, Min Lin
Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented performance in response generation, especially with visual inputs, enabling more creative and adaptable interaction than large language models such as ChatGPT.
2 code implementations • 3 May 2023 • Chao Du, Tianbo Li, Tianyu Pang, Shuicheng Yan, Min Lin
Sliced-Wasserstein Flow (SWF) is a promising approach to nonparametric generative modeling but has not been widely adopted due to its suboptimal generative quality and lack of conditional modeling capabilities.
1 code implementation • 17 Apr 2023 • Qian Liu, Fan Zhou, Zhengbao Jiang, Longxu Dou, Min Lin
Empirical results on various benchmarks validate that the integration of SQL execution leads to significant improvements in zero-shot scenarios, particularly in table reasoning.
1 code implementation • CVPR 2023 • Yunqing Zhao, Chao Du, Milad Abdollahzadeh, Tianyu Pang, Min Lin, Shuicheng Yan, Ngai-Man Cheung
To this end, we propose knowledge truncation to mitigate this issue in FSIG, which is a complementary operation to knowledge preservation and is implemented by a lightweight pruning-based method.
1 code implementation • 17 Mar 2023 • Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, Min Lin
Diffusion models (DMs) have demonstrated advantageous potential on generative tasks.
no code implementations • 1 Mar 2023 • Tianbo Li, Min Lin, Zheyuan Hu, Kunhao Zheng, Giovanni Vignale, Kenji Kawaguchi, A. H. Castro Neto, Kostya S. Novoselov, Shuicheng Yan
Kohn-Sham Density Functional Theory (KS-DFT) has been traditionally solved by the Self-Consistent Field (SCF) method.
1 code implementation • 9 Feb 2023 • Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan
With the advance of language models, privacy protection is receiving more attention.
2 code implementations • 9 Feb 2023 • Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan
Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70. 69\%$ and $42. 67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i. e. improving upon previous state-of-the-art models by $+4. 58\%$ and $+8. 03\%$.
1 code implementation • 28 Jan 2023 • Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin
BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments, because the clients in BAFFLE only execute forward propagation and return a set of scalars to the server.
no code implementations • ICCV 2023 • Yun Wang, Cheng Chi, Min Lin, Xin Yang
This approach circulates high-resolution estimated information (scene flow and feature) from the preceding iteration back to the low-resolution layer of the current iteration.
1 code implementation • NeurIPS 2023 • Xiao Ma, Bingyi Kang, Zhongwen Xu, Min Lin, Shuicheng Yan
In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction.
no code implementations • 26 Sep 2022 • Yun Zhao, Hang Chen, Min Lin, Haiou Zhang, Tao Yan, Xing Lin, Ruqi Huang, Qionghai Dai
Increasing the layer number of on-chip photonic neural networks (PNNs) is essential to improve its model performance.
3 code implementations • 21 Jun 2022 • Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan
EnvPool is open-sourced at https://github. com/sail-sg/envpool.
no code implementations • 26 May 2022 • Tianyu Pang, Shuicheng Yan, Min Lin
In this paper, we substitute the Slater determinant with a pairwise antisymmetry construction, which is easy to implement and can reduce the computational cost to $O(N^2)$.
no code implementations • COLING 2022 • Ziqing Yang, Zihang Xu, Yiming Cui, Baoxin Wang, Min Lin, Dayong Wu, Zhigang Chen
It covers Standard Chinese, Yue Chinese, and six other ethnic minority languages.
1 code implementation • 21 Feb 2022 • Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
1 code implementation • 30 Dec 2021 • Yongduo Sui, Xiang Wang, Jiancan Wu, Min Lin, Xiangnan He, Tat-Seng Chua
To endow the classifier with better interpretation and generalization, we propose the Causal Attention Learning (CAL) strategy, which discovers the causal patterns and mitigates the confounding effect of shortcuts.
1 code implementation • NeurIPS 2021 • Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
The fine-tuning of pre-trained language models has a great success in many NLP fields.
1 code implementation • 27 Oct 2021 • Kun Li, Meng Li, Yanling Li, Min Lin
The traditional trend prediction models can better predict the short trend than the long trend.
no code implementations • 13 May 2021 • Bai Zhao, Min Lin, Ming Cheng, Wei-Ping Zhu, Naofal Al-Dhahir
This paper proposes a robust beamforming scheme to enhance the physical layer security (PLS) of multicast transmission in a cognitive satellite and aerial network (CSAN) operating in the millimeter wave frequency band.
no code implementations • NeurIPS 2020 • Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Page-Caccia, Issam Hadj Laradji, Irina Rish, Alexandre Lacoste, David Vázquez, Laurent Charlin
The main challenge is that the agent must not forget previous tasks and also adapt to novel tasks in the stream.
no code implementations • ICML Workshop LifelongML 2020 • Xu He, Min Lin
We compare these approaches in terms of both compression and forgetting and empirically study the reasons that limit the performance of continual learning methods based on variational posterior approximation.
1 code implementation • NeurIPS 2020 • Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Caccia, Issam Laradji, Irina Rish, Alexandre Lacoste, David Vazquez, Laurent Charlin
We propose Continual-MAML, an online extension of the popular MAML algorithm as a strong baseline for this scenario.
2 code implementations • NeurIPS 2019 • Rahaf Aljundi, Eugene Belilovsky, Tinne Tuytelaars, Laurent Charlin, Massimo Caccia, Min Lin, Lucas Page-Caccia
Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks.
1 code implementation • 11 Aug 2019 • Rahaf Aljundi, Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Min Lin, Laurent Charlin, Tinne Tuytelaars
Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks.
no code implementations • 16 Jun 2019 • Min Lin, Jie Fu, Yoshua Bengio
In this study, we analyze parameter sharing under the conditional computation framework where the parameters of a neural network are conditioned on each input example.
4 code implementations • NeurIPS 2019 • Rahaf Aljundi, Min Lin, Baptiste Goujaud, Yoshua Bengio
To prevent forgetting, a replay buffer is usually employed to store the previous data for the purpose of rehearsal.
2 code implementations • ICLR 2019 • Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville
Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy.
no code implementations • 16 Dec 2017 • Jun-Bo Wang, Junyuan Wang, Yongpeng Wu, Jin-Yuan Wang, Huiling Zhu, Min Lin, Jiangzhou Wang
Moreover, optimal or near-optimal solutions of historical scenarios can be searched offline and stored in advance.
4 code implementations • 20 Apr 2017 • Min Lin
In the generator training phase, the target is to assign equal probability to all data points in the batch, each with probability $\frac{1}{M+N}$.
2 code implementations • 3 Dec 2015 • Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang
This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.
no code implementations • 18 Jan 2015 • Canyi Lu, Jinhui Tang, Min Lin, Liang Lin, Shuicheng Yan, Zhouchen Lin
In this paper, we study the robust subspace clustering problem, which aims to cluster the given possibly noisy data points into their underlying subspaces.
1 code implementation • 19 Dec 2014 • Min Lin, Shuo Li, Xuan Luo, Shuicheng Yan
In this paper, we introduce a novel deep learning framework, termed Purine.
18 code implementations • 16 Dec 2013 • Min Lin, Qiang Chen, Shuicheng Yan
With enhanced local modeling via the micro network, we are able to utilize global average pooling over feature maps in the classification layer, which is easier to interpret and less prone to overfitting than traditional fully connected layers.
Ranked #4 on Face Identification on DroneSURF