1 code implementation • ECCV 2020 • Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin
There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.
no code implementations • 23 May 2024 • Guibin Zhang, Xiangguo Sun, Yanwei Yue, Kun Wang, Tianlong Chen, Shirui Pan
Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
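The per-node routing idea can be sketched compactly. Below is a minimal, illustrative PyTorch sketch of top-1 routing among sparsifier "experts" that, for simplicity, differ only in how many of a node's edges they keep; the class name, the keep-ratio experts, and the hard argmax routing are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MoGSketch(nn.Module):
    """Hypothetical per-node mixture of graph sparsifiers with top-1 routing."""
    def __init__(self, feat_dim, keep_ratios=(0.2, 0.5, 0.8)):
        super().__init__()
        self.keep_ratios = keep_ratios                   # one sparsity level per "expert"
        self.gate = nn.Linear(feat_dim, len(keep_ratios))

    def forward(self, x, edge_index, edge_weight):
        # hard top-1 expert choice per node (the real method would use a soft/learnable gate)
        expert = self.gate(x).argmax(dim=-1)             # [num_nodes]
        keep = torch.zeros(edge_index.size(1), dtype=torch.bool)
        src = edge_index[0]
        for v in range(x.size(0)):                       # prune each node's outgoing edges
            idx = (src == v).nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue
            k = max(1, int(self.keep_ratios[int(expert[v])] * idx.numel()))
            top = edge_weight[idx].topk(k).indices       # keep the node's strongest edges
            keep[idx[top]] = True
        return edge_index[:, keep], edge_weight[keep]
```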
1 code implementation • 8 May 2024 • Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, BoJian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen
In a synergized framework where the LLM and KG mutually enhance each other, we first leverage the LLM to construct an evolving AD-specific knowledge graph (KG) from AD-related scientific literature; we then use a coarse-to-fine sampling method with a novel self-aware knowledge retrieval approach to select appropriate knowledge from the KG and augment the LLM's inference capabilities.
1 code implementation • 30 Apr 2024 • Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen
Nevertheless, one of its major bottlenecks is matrix inversion, which is notably time-consuming, requiring $O(N^3)$ time, and scales poorly.
no code implementations • 7 Apr 2024 • YiFan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
Our dataset and benchmark initiative reveals the nature and rationale of facial affective behaviors, i.e., fine-grained facial movement, interpretability, and reasoning.
no code implementations • 5 Apr 2024 • Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella
In this work, we observe the saturation of the computationally expensive feed-forward blocks of LLM layers and propose FFN-SkipLLM, a novel fine-grained skipping strategy for autoregressive LLMs.
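A simple way to see this saturation is to probe, on calibration data, how little each feed-forward block changes its input. The sketch below does exactly that; the cosine-similarity probe and the residual-FFN assumption are illustrative, not the paper's actual skipping policy.

```python
import torch

@torch.no_grad()
def ffn_saturation_scores(ffn_blocks, hidden):
    """Score each FFN block by the cosine similarity between its input and output
    on calibration activations; a high score marks a 'saturated' block that is a
    candidate for skipping."""
    scores = []
    for ffn in ffn_blocks:                  # each ffn: the feed-forward sub-module of a layer
        out = hidden + ffn(hidden)          # assumes a residual feed-forward block
        sim = torch.nn.functional.cosine_similarity(hidden, out, dim=-1).mean()
        scores.append(sim.item())
        hidden = out
    return scores                           # skip blocks whose score exceeds a chosen threshold
```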
1 code implementation • 3 Apr 2024 • Shwai He, Tianlong Chen
Moreover, while parameter-efficient LoRA fine-tuning has been proposed to recover the performance of sparse models, merging the weights poses a significant challenge: dense LoRA modules are incompatible with sparse models, and folding them in destroys the sparsity of the pruned weights.
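The incompatibility is easy to see in code: folding a dense low-rank update into a pruned weight fills its zeroed entries. The sketch below shows the naive merge plus one obvious (illustrative, not the paper's) remedy of reapplying the binary mask.

```python
import torch

def merge_lora_into_pruned(weight, mask, lora_A, lora_B, scaling=1.0):
    """Fold a dense LoRA update (B @ A) into a pruned weight.
    Adding the dense delta alone would fill the pruned (zero) entries and destroy
    sparsity; reapplying the binary mask preserves the pruned pattern."""
    delta = scaling * (lora_B @ lora_A)     # dense low-rank update
    return (weight + delta) * mask          # keep the original sparsity pattern

# toy usage: a ~50%-sparse weight stays ~50%-sparse after merging
W = torch.randn(64, 64)
M = (torch.rand_like(W) > 0.5).float()
A, B = torch.randn(8, 64), torch.randn(64, 8)
W_merged = merge_lora_into_pruned(W * M, M, A, B)
print((W_merged != 0).float().mean())       # roughly 0.5: sparsity is preserved
```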
no code implementations • 30 Mar 2024 • Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo
Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility.
no code implementations • 11 Mar 2024 • Chi-Yang Hsu, Kyle Cox, Jiawei Xu, Zhen Tan, Tianhua Zhai, Mengzhou Hu, Dexter Pratt, Tianlong Chen, Ziniu Hu, Ying Ding
We present the Thought Graph as a novel framework to support complex reasoning and use gene set analysis as an example to uncover semantic relationships between biological processes.
no code implementations • 8 Mar 2024 • Zhen Tan, Jie Peng, Tianlong Chen, Huan Liu
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks through few-shot or zero-shot prompting, bypassing the need for parameter tuning.
no code implementations • 7 Mar 2024 • Tiejin Chen, Longchao Da, Huixue Zhou, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei
The privacy concerns associated with the use of Large Language Models (LLMs) have grown recently with the development of LLMs such as ChatGPT.
no code implementations • 2 Mar 2024 • Song Wang, Zhen Tan, Xinyu Zhao, Tianlong Chen, Huan Liu, Jundong Li
In contrast, in this work, we propose a novel self-conditioned graph generation framework designed to explicitly model graph distributions and employ these distributions to guide the generation process.
1 code implementation • 22 Feb 2024 • Lichi Li, Zainul Abi Din, Zhen Tan, Sam London, Tianlong Chen, Ajay Daptardar
In the evolving e-commerce field, recommendation systems crucially shape user experience and engagement.
1 code implementation • 22 Feb 2024 • Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.
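One minimal way to realize "selective retention of moderately high-loss samples" is to score every example by its current loss and keep only those in a mid-to-high band. The quantile band and per-sample cross-entropy scoring below are illustrative choices, not the paper's exact selection rule.

```python
import torch

@torch.no_grad()
def select_moderately_hard(model, loader, loss_fn, low_q=0.5, high_q=0.9, device="cpu"):
    """Return dataset indices whose loss falls between the low_q and high_q quantiles.
    loss_fn is expected to be a per-sample loss (reduction='none')."""
    losses, indices, offset = [], [], 0
    for x, y in loader:
        per_sample = loss_fn(model(x.to(device)), y.to(device)).cpu()
        losses.append(per_sample)
        indices.append(torch.arange(offset, offset + per_sample.numel()))
        offset += per_sample.numel()
    losses, indices = torch.cat(losses), torch.cat(indices)
    lo, hi = losses.quantile(low_q), losses.quantile(high_q)
    return indices[(losses >= lo) & (losses <= hi)]     # samples to retain for continual training
```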
no code implementations • 22 Feb 2024 • Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Huaxiu Yao, Yue Zhang, Ren Wang, Kaidi Xu, Xiaoshuang Shi
Uncertainty estimation plays a pivotal role in ensuring the reliability of safety-critical human-AI interaction systems, particularly in the medical domain.
1 code implementation • 20 Feb 2024 • Zhen Tan, Chengshuai Zhao, Raha Moraffah, YiFan Li, Yu Kong, Tianlong Chen, Huan Liu
Unlike direct harmful output generation for MLLMs, our research demonstrates how a single MLLM agent can be subtly influenced to generate prompts that, in turn, induce other MLLM agents in the society to output malicious content.
1 code implementation • 19 Feb 2024 • Jinhao Duan, Renming Zhang, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Elias Stengel-Eskin, Mohit Bansal, Tianlong Chen, Kaidi Xu
As Large Language Models (LLMs) are integrated into critical real-world applications, their strategic and logical reasoning abilities are increasingly crucial.
1 code implementation • 18 Feb 2024 • Yihua Zhang, Pingzhi Li, Junyuan Hong, Jiaxiang Li, Yimeng Zhang, Wenqing Zheng, Pin-Yu Chen, Jason D. Lee, Wotao Yin, Mingyi Hong, Zhangyang Wang, Sijia Liu, Tianlong Chen
In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has become standard.
no code implementations • 2 Feb 2024 • Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen
Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor.
1 code implementation • 28 Jan 2024 • Dawei Li, Zhen Tan, Tianlong Chen, Huan Liu
While textual information significantly enhances the performance of pre-trained language models (PLMs) in knowledge graph completion (KGC), the static and noisy nature of existing corpora collected from Wikipedia articles or synset definitions often limits the potential of PLM-based KGC models.
1 code implementation • 10 Jan 2024 • Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang
To address these two pain points, we propose QuantumSEA, an in-time sparse exploration for noise-adaptive quantum circuits, aiming to achieve two key objectives: (1) implicit circuit capacity during training, by dynamically exploring the circuit's sparse connectivity while keeping a fixed, small number of quantum gates throughout training, which satisfies coherence-time constraints and incurs only light noise, enabling feasible execution on real quantum devices; and (2) noise robustness, by jointly optimizing the topology and parameters of quantum circuits under real-device noise models.
1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions.
2 code implementations • 22 Dec 2023 • Zhen Tan, Tianlong Chen, Zhenyu Zhang, Huan Liu
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
no code implementations • 9 Dec 2023 • Tianjin Huang, Tianlong Chen, Zhangyang Wang, Shiwei Liu
Therefore, it remains unclear whether the self-attention operation is crucial for the recent advances in SSL, or whether CNNs can deliver the same excellence with more advanced designs.
1 code implementation • 3 Dec 2023 • Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen
The rapid development of large-scale deep learning models strains the affordability of hardware platforms, which necessitates pruning to reduce their computational and memory footprints.
1 code implementation • 3 Dec 2023 • Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang
Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
1 code implementation • 27 Nov 2023 • Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu
Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization.
no code implementations • 15 Nov 2023 • Yun Zhu, Nevan Wichers, Chu-Cheng Lin, Xinyi Wang, Tianlong Chen, Lei Shu, Han Lu, Canoee Liu, Liangchen Luo, Jindong Chen, Lei Meng
Parameter-Efficient Tuning has been a prominent approach for adapting Large Language Models to downstream tasks.
1 code implementation • 2 Oct 2023 • Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
Sparsely activated Mixture-of-Experts (SMoE) has shown promise in scaling up the learning capacity of neural networks; however, it has issues such as (a) high memory usage, due to duplicating network layers into multiple copies as experts, and (b) redundancy in experts, as common learning-based routing policies suffer from representational collapse.
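One natural way to attack the memory and redundancy issues together is to merge experts that have become near-duplicates. The sketch below greedily groups experts by raw weight similarity and averages each group; this grouping criterion is an illustrative stand-in, not the paper's exact merging rule.

```python
import torch
import torch.nn.functional as F

def merge_similar_experts(expert_weights, threshold=0.95):
    """Greedily group experts whose flattened weights have cosine similarity above
    `threshold` and replace each group by its average, shrinking the duplicated
    expert copies. Returns the merged weights and the old-to-group assignment."""
    flat = F.normalize(torch.stack([w.flatten() for w in expert_weights]), dim=1)
    sim = flat @ flat.t()                                # pairwise expert similarity
    groups, assigned = [], set()
    for i in range(len(expert_weights)):
        if i in assigned:
            continue
        group = [j for j in range(len(expert_weights))
                 if j not in assigned and sim[i, j] >= threshold]
        assigned.update(group)
        groups.append(group)
    merged = [torch.stack([expert_weights[j] for j in g]).mean(dim=0) for g in groups]
    return merged, groups
```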
1 code implementation • ICCV 2023 • Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma, Yi Wang, Zhangyang Wang
Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field.
1 code implementation • ICCV 2023 • Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, huan zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu
Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How to adversarially robustify a CNN-based MoE model?
1 code implementation • 25 Jun 2023 • Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlaod Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy
Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
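The eviction rule itself is compact: always keep the most recent tokens, plus the tokens with the largest accumulated attention mass. The following per-sequence sketch omits batching and per-head bookkeeping, so it is a simplification of the policy described above rather than the released implementation.

```python
import torch

def h2o_keep_mask(acc_attn, num_recent, num_heavy):
    """Decide which cached tokens survive eviction: the most recent `num_recent`
    tokens plus the `num_heavy` 'heavy hitter' tokens with the largest
    accumulated attention mass among the rest."""
    T = acc_attn.size(0)                                # current KV-cache length
    keep = torch.zeros(T, dtype=torch.bool)
    keep[-num_recent:] = True                           # always keep recent tokens
    older = acc_attn.masked_fill(keep, float("-inf"))   # exclude recents from the heavy pool
    k = max(0, min(num_heavy, T - num_recent))
    if k > 0:
        keep[older.topk(k).indices] = True
    return keep                                         # evict K/V entries where keep is False

# usage: acc_attn holds per-token attention mass accumulated during decoding
mask = h2o_keep_mask(torch.rand(128), num_recent=16, num_heavy=16)
```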
1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang
Motivated by recent observations of model soups, which suggest that the fine-tuned weights of multiple models can be merged into a better minimum, we propose Instant Soup Pruning (ISP) to generate lottery-ticket-quality subnetworks at a fraction of the original IMP cost, by replacing IMP's expensive intermediate pruning stages with a computationally efficient weak-mask generation and aggregation routine.
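The weak-mask idea can be sketched as: generate several cheap magnitude masks, then aggregate them by voting and re-threshold to the target sparsity. The voting rule below is an illustrative stand-in for ISP's aggregation routine, not the paper's exact procedure.

```python
import torch

def magnitude_mask(weight, sparsity):
    """Cheap 'weak' mask: keep the (1 - sparsity) fraction of largest-magnitude weights."""
    k = max(1, int(weight.numel() * (1 - sparsity)))
    thresh = weight.abs().flatten().topk(k).values.min()
    return (weight.abs() >= thresh).float()

def soup_mask(weak_masks, sparsity):
    """Aggregate several weak masks by voting, then re-threshold the votes so the
    final mask hits the target sparsity."""
    votes = torch.stack(weak_masks).mean(dim=0)          # fraction of masks keeping each weight
    k = max(1, int(votes.numel() * (1 - sparsity)))
    thresh = votes.flatten().topk(k).values.min()
    return (votes >= thresh).float()
```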
1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang
By partitioning giant graph data, we build multiple weaker GNNs (soup ingredients) trained independently and in parallel without any intermediate communication, and combine their strengths using a greedy interpolation soup procedure to achieve state-of-the-art performance.
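The greedy interpolation step follows the familiar model-soups recipe: average in an ingredient's weights only if validation performance does not drop. The sketch below covers just that step (assuming floating-point parameters); the graph partitioning and the GNN training that produce the ingredients are outside this snippet.

```python
import copy

def greedy_weight_soup(models, eval_fn):
    """Greedy soup: running average of ingredient state_dicts, keeping an ingredient
    only if eval_fn (a validation score on a candidate state_dict) does not decrease."""
    soup = copy.deepcopy(models[0].state_dict())
    n, best = 1, eval_fn(soup)
    for m in models[1:]:
        cand = m.state_dict()
        trial = {k: (v * n + cand[k]) / (n + 1) for k, v in soup.items()}
        score = eval_fn(trial)
        if score >= best:                 # keep this ingredient in the soup
            soup, n, best = trial, n + 1, score
    return soup
```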
1 code implementation • 3 Mar 2023 • Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang
In pursuit of a more general evaluation and unveiling the true potential of sparse algorithms, we introduce the "Sparsity May Cry" Benchmark (SMC-Bench), a collection of 4 carefully curated, diverse tasks with 10 datasets that captures a wide range of domain-specific and sophisticated knowledge.
1 code implementation • 2 Mar 2023 • Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Despite their remarkable achievement, gigantic transformers encounter significant drawbacks, including exorbitant computational and memory footprints during training, as well as severe collapse evidenced by a high degree of parameter redundancy.
1 code implementation • 28 Feb 2023 • Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang
This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same "task distribution".
1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.
1 code implementation • ICCV 2023 • Tianlong Chen, Xuxi Chen, Xianzhi Du, Abdullah Rashwan, Fan Yang, Huizhong Chen, Zhangyang Wang, Yeqing Li
Instead of compressing multiple tasks' knowledge into a single model, MoE separates the parameter space and only utilizes the relevant model pieces given task type and its input, which provides stabilized MTL training and ultra-efficient inference.
no code implementations • 6 Dec 2022 • Ajay Jaiswal, Tianlong Chen, Justin F. Rousseau, Yifan Peng, Ying Ding, Zhangyang Wang
However, DNNs are notoriously fragile to the class imbalance in image classification.
1 code implementation • 28 Nov 2022 • Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu
Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks).
1 code implementation • 19 Nov 2022 • Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang
Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.
no code implementations • 9 Nov 2022 • Kaixiong Zhou, Zhenyu Zhang, Shengyuan Chen, Tianlong Chen, Xiao Huang, Zhangyang Wang, Xia Hu
Quantum neural networks (QNNs), an interdisciplinary field of quantum computing and machine learning, have attracted tremendous research interests due to the specific quantum advantages.
1 code implementation • NIPS 2022 • Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
Improving the performance of deep networks in data-limited regimes has warranted much attention.
1 code implementation • 26 Oct 2022 • Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang
However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at inference, current MTL regimes have to activate nearly the entire model even to just execute a single task.
1 code implementation • 14 Oct 2022 • Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin F. Rousseau, Ying Ding, Zhangyang Wang
In this paper, firstly, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability, as well as achieve state-of-the-art (SOTA) level performance from vanilla-GCNs.
2 code implementations • 14 Oct 2022 • Keyu Duan, Zirui Liu, Peihao Wang, Wenqing Zheng, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
Ranked #2 on Node Property Prediction on ogbn-products
1 code implementation • 8 Oct 2022 • Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu
To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.
1 code implementation • 7 Oct 2022 • Tianxin Wei, Yuning You, Tianlong Chen, Yang shen, Jingrui He, Zhangyang Wang
This paper targets improving the generalizability of hypergraph neural networks in the low-label regime by applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL).
2 code implementations • 15 Sep 2022 • Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zhangyang Wang
Vision Transformers (ViTs) have proven effective in solving 2D image understanding tasks by training over large-scale image datasets, and meanwhile, as a somewhat separate track, in modeling the 3D visual world, such as voxels or point clouds.
1 code implementation • 27 Jul 2022 • Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
While prior works on NeRFs optimize a scene representation by inverting a handcrafted rendering equation, GNT achieves neural representation and rendering that generalizes across scenes using transformers at two stages.
Ranked #1 on Generalizable Novel View Synthesis on LLFF
1 code implementation • 8 Jul 2022 • Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
In this paper, we present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID) from a data collection and representing INR as a functional combination of basis sampled from the dictionary.
1 code implementation • 7 Jul 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Xuxi Chen, Qiao Xiao, Boqian Wu, Tommi Kärkkäinen, Mykola Pechenizkiy, Decebal Mocanu, Zhangyang Wang
Transformers have quickly shined in the computer vision world since the emergence of Vision Transformers (ViTs).
1 code implementation • CVPR 2022 • Tianlong Chen, Peihao Wang, Zhiwen Fan, Zhangyang Wang
Inspired by that, we propose Augmented NeRF (Aug-NeRF), which for the first time brings the power of robust data augmentations into regularizing the NeRF training.
1 code implementation • 26 Jun 2022 • Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang
Pruning large neural networks to create high-quality, independently trainable sparse masks, which can maintain similar performance to their dense counterparts, is very desirable due to the reduced space and time complexity.
no code implementations • 15 Jun 2022 • Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang
Inspired by the recent success of learning robust models with unlabeled data, we explore a new robustness-aware CIL setting, where the learned adversarial robustness has to resist forgetting and be transferred as new tasks come in continually.
1 code implementation • 15 Jun 2022 • Tianlong Chen, huan zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang
Certifiable robustness is a highly desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios, but often demands tedious computations to establish.
1 code implementation • 15 Jun 2022 • Zhangheng Li, Tianlong Chen, Linyi Li, Bo Li, Zhangyang Wang
Given the fact that neural networks are often over-parameterized, one effective way to reduce such computational overhead is neural network pruning, by removing redundant parameters from trained neural networks.
1 code implementation • 9 Jun 2022 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang
For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, Zhangyang Wang
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger.
1 code implementation • 4 Apr 2022 • Diganta Misra, Bharat Runwal, Tianlong Chen, Zhangyang Wang, Irina Rish
With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings.
1 code implementation • ICLR 2022 • Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen yang, Ji Liu, Zhangyang Wang
Vision transformers (ViTs) have gained popularity recently.
1 code implementation • ICLR 2022 • Wenqing Zheng, Tianlong Chen, Ting-Kuei Hu, Zhangyang Wang
Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks.
1 code implementation • ICLR 2022 • Tianshu Huang, Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang
Selecting an appropriate optimizer for a given problem is of major interest for researchers and practitioners.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang
However, a "head-to-toe assessment" regarding the extent of redundancy in ViTs, and how much we could gain by thoroughly mitigating such, has been absent for this field.
1 code implementation • 9 Mar 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang
The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
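Because a softmaxed attention matrix is row-stochastic, its low-pass component is simply the uniform matrix, which makes the decomposition above one line of code. The sketch below follows that description; the fixed scalar `omega` stands in for what may be a learnable rescaling factor, and this is not the released implementation.

```python
import torch

def attn_scale(attn, omega=2.0):
    """Split a row-stochastic attention matrix into its low-pass (uniform) and
    high-pass parts, then re-weight the high-pass part to counteract the
    over-smoothing of deep ViTs."""
    n = attn.size(-1)
    low = torch.full_like(attn, 1.0 / n)     # DC / low-pass component of a softmaxed matrix
    high = attn - low                        # high-pass residual
    return low + omega * high                # all-pass, high-frequency-boosted attention
```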
no code implementations • 5 Mar 2022 • Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen
Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.
1 code implementation • ICLR 2022 • Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang
We introduce two alternatives for sparse adversarial training: (i) static sparsity, by leveraging recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from the early training; (ii) dynamic sparsity, by allowing the sparse subnetwork to adaptively adjust its connectivity pattern (while sticking to the same sparsity ratio) throughout training.
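For the dynamic-sparsity option, one standard prune-and-grow update (in the spirit of RigL) keeps the sparsity ratio fixed while letting the connectivity pattern adapt; the sketch below is that generic update, not the paper's exact schedule.

```python
import torch

@torch.no_grad()
def prune_and_grow(weight, grad, mask, update_frac=0.1):
    """One dynamic-sparsity step: drop the smallest-magnitude active weights and
    regrow the same number of inactive ones with the largest gradient magnitude,
    so the overall sparsity ratio stays fixed."""
    active = mask.bool()
    n_update = int(update_frac * active.sum().item())
    if n_update == 0:
        return mask
    # drop: smallest |w| among currently active connections
    w_act = weight.abs().masked_fill(~active, float("inf"))
    drop = w_act.flatten().topk(n_update, largest=False).indices
    # grow: largest |grad| among currently inactive connections
    g_inact = grad.abs().masked_fill(active, float("-inf"))
    grow = g_inact.flatten().topk(n_update).indices
    new_mask = mask.clone().flatten()
    new_mask[drop], new_mask[grow] = 0.0, 1.0
    return new_mask.view_as(mask)
```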
1 code implementation • 9 Feb 2022 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
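For reference, the textbook loop used to find such winning tickets is iterative magnitude pruning (IMP) with weight rewinding: train, prune the smallest surviving weights, rewind, and repeat. The sketch below is that generic loop under simplifying assumptions (per-tensor pruning of weight matrices, a user-supplied `train_fn` that respects the masks), not this paper's specific procedure.

```python
import copy
import torch

def imp_find_ticket(model, train_fn, rounds=5, prune_frac=0.2):
    """Iterative magnitude pruning with rewinding: after each training round, keep
    only the largest-magnitude surviving weights, then reset weights to their
    initial values. Returns the binary masks defining the winning ticket."""
    init = copy.deepcopy(model.state_dict())
    masks = {k: torch.ones_like(v) for k, v in model.state_dict().items() if v.dim() > 1}
    for _ in range(rounds):
        train_fn(model, masks)                          # train with masks applied to the weights
        for k, m in masks.items():
            w = model.state_dict()[k].abs()
            surviving = w[m.bool()]
            n_keep = max(1, int((1 - prune_frac) * surviving.numel()))
            thresh = surviving.topk(n_keep).values.min()
            masks[k] = ((w >= thresh) & m.bool()).float()
        model.load_state_dict(init)                     # rewind remaining weights to init
    return masks
```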
1 code implementation • ICLR 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy
In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks.
no code implementations • 17 Jan 2022 • Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang
To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.
1 code implementation • 4 Jan 2022 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
Accordingly, we have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators, assuming that graph priors per se, similar to the concept of image manifolds, can be learned by data generation.
1 code implementation • CVPR 2022 • Zhiwen Fan, Tianlong Chen, Peihao Wang, Zhangyang Wang
CADTransformer tokenizes directly from the set of graphical primitives in CAD drawings, and correspondingly optimizes line-grained semantic and instance symbol spotting altogether by a pair of prediction heads.
1 code implementation • NeurIPS 2021 • Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang
Contrastive learning approaches have achieved great success in learning visual representations with few labels of the target classes.
1 code implementation • NeurIPS 2021 • Xuxi Chen, Tianlong Chen, Zhenyu Zhang, Zhangyang Wang
The lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork (i.e., a winning ticket) instead of a full model for both training and inference, which can lower both costs without sacrificing performance.
1 code implementation • 30 Oct 2021 • Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng
To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
no code implementations • 28 Oct 2021 • Haotian Xue, Kaixiong Zhou, Tianlong Chen, Kai Guo, Xia Hu, Yi Chang, Xin Wang
In this paper, we investigate GNNs from the lens of weight and feature loss landscapes, i.e., the loss changes with respect to model weights and node features, respectively.
1 code implementation • 9 Oct 2021 • Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language was seen as an individual task and was learned sequentially and continually.
no code implementations • 7 Oct 2021 • William T. Redman, Tianlong Chen, Zhangyang Wang, Akshunna S. Dogra
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
no code implementations • 29 Sep 2021 • Yongduo Sui, Xiang Wang, Tianlong Chen, Xiangnan He, Tat-Seng Chua
In this work, we propose a simple and effective learning paradigm, Inductive Co-Pruning of GNNs (ICPG), to endow graph lottery tickets with inductive pruning capacity.
1 code implementation • ICLR 2022 • Lu Miao, Xiaolong Luo, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang
Conventional methods often require (iterative) pruning followed by re-training, which not only incurs large overhead beyond the original DNN training but also can be sensitive to retraining hyperparameters.
no code implementations • 29 Sep 2021 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
no code implementations • 29 Sep 2021 • William T Redman, Tianlong Chen, Akshunna S. Dogra, Zhangyang Wang
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
no code implementations • ICLR 2022 • Shaojin Ding, Tianlong Chen, Zhangyang Wang
In this paper, we investigate the tantalizing possibility of using the lottery ticket hypothesis to discover lightweight speech recognition models that are (1) robust to various noise existing in speech; (2) transferable to fit open-world personalization; and (3) compatible with structured sparsity.
no code implementations • ICLR 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang
The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
no code implementations • 29 Sep 2021 • Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen
Perhaps most importantly, we find that instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.
no code implementations • ICLR 2022 • Yuning You, Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.
no code implementations • 29 Sep 2021 • Duc N.M Hoang, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang
Despite the preliminary success, we argue that for GNNs, NAS has to be customized further, due to the topological complicacy of GNN input data (graph) as well as the notorious training instability.
no code implementations • 29 Sep 2021 • Haoyu Ma, Yifan Huang, Tianlong Chen, Hao Tang, Chenyu You, Zhangyang Wang, Xiaohui Xie
However, it is unclear why the distorted distribution of the logits is catastrophic to the student model.
no code implementations • 29 Sep 2021 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.
1 code implementation • 24 Aug 2021 • Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang
In view of those, we present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
no code implementations • 16 Jul 2021 • Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin
Semantic segmentation for scene understanding is now in wide demand, raising significant challenges for algorithm efficiency, especially for applications on resource-limited platforms.
2 code implementations • NeurIPS 2021 • Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang
Based on our analysis, we summarize a guideline for parameter settings with regard to specific architecture characteristics, which we hope will catalyze research progress on the lottery ticket hypothesis.
2 code implementations • ICLR 2022 • Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
Our framework, FreeTickets, is defined as the ensemble of these relatively cheap sparse subnetworks.
1 code implementation • 24 Jun 2021 • Ting-Kuei Hu, Fernando Gama, Tianlong Chen, Wenqing Zheng, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler
Our framework is implemented by a cascade of a convolutional and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation and agent action inference, respectively.
2 code implementations • NeurIPS 2021 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has drawn considerable attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).
1 code implementation • 10 Jun 2021 • Mingkang Zhu, Tianlong Chen, Zhangyang Wang
Compared to state-of-the-art methods, our homotopy attack leads to significantly fewer perturbations, e.g., reducing them by 42.91% on CIFAR-10 and 75.03% on ImageNet (average case, targeted attack) at similar maximal perturbation magnitudes, while still achieving 100% attack success rates.
2 code implementations • 10 Jun 2021 • Yuning You, Tianlong Chen, Yang shen, Zhangyang Wang
Unfortunately, unlike its counterpart on image data, the effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset, by either rules of thumb or trial-and-errors, owing to the diverse nature of graph data.
1 code implementation • NeurIPS 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
For example, our sparsified DeiT-Small at (5%, 50%) sparsity for (data, architecture) improves top-1 accuracy by 0.28%, while enjoying 49.32% FLOPs and 4.40% running-time savings.
Ranked #20 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • 6 Jun 2021 • Zhenyu Zhang, Xuxi Chen, Tianlong Chen, Zhangyang Wang
We observe that a high-quality winning ticket can be found by training and pruning the dense network on the very compact PrAC set, which substantially saves training iterations in the ticket-finding process.
1 code implementation • 6 Jun 2021 • Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang
Hence, the key innovation in SDCLR is to create a dynamic self-competitor model to contrast with the target model, which is a pruned version of the latter.
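The pruning step that produces the self-competitor is simple to sketch: copy the current target encoder and zero out its smallest-magnitude weights, re-creating the copy as the target evolves. Magnitude pruning is used below as an illustrative choice, and the subsequent contrastive step between the two branches is omitted.

```python
import copy
import torch

def make_self_competitor(target_encoder, sparsity=0.9):
    """Build the dynamic self-competitor: a deep copy of the current target encoder
    with its smallest-magnitude weights zeroed out."""
    competitor = copy.deepcopy(target_encoder)
    with torch.no_grad():
        for p in competitor.parameters():
            if p.dim() > 1:                              # prune weight matrices only
                k = max(1, int(p.numel() * sparsity))
                thresh = p.abs().flatten().kthvalue(k).values
                p.mul_((p.abs() > thresh).float())
    return competitor
```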
1 code implementation • ICLR 2021 • Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen
In this work, we for the first time study the existence of such trainable matching subnetworks in deep GANs.
1 code implementation • ICLR 2021 • Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang
Knowledge Distillation (KD) is a widely used technique to transfer knowledge from pre-trained teacher models to (usually more lightweight) student models.
no code implementations • CVPR 2021 • Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics.
no code implementations • 23 Apr 2021 • Zhe Gan, Yen-Chun Chen, Linjie Li, Tianlong Chen, Yu Cheng, Shuohang Wang, Jingjing Liu, Lijuan Wang, Zicheng Liu
However, we can find "relaxed" winning tickets at 50%-70% sparsity that maintain 99% of the full accuracy.
1 code implementation • 22 Apr 2021 • Arman Maesumi, Mingkang Zhu, Yi Wang, Tianlong Chen, Zhangyang Wang, Chandrajit Bajaj
This paper presents a novel patch-based adversarial attack pipeline that trains adversarial patches on 3D human meshes.
1 code implementation • 16 Apr 2021 • Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang
However, the BN layer is costly to calculate and is typically implemented with non-binary parameters, leaving a hurdle for the efficient implementation of BNN training.
Ranked #167 on Image Classification on CIFAR-10
1 code implementation • 23 Mar 2021 • Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin
It automates the design of an optimization method based on its performance on a set of training problems.
1 code implementation • 22 Mar 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, JianFeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
1 code implementation • NeurIPS 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Jingjing Liu, Zhangyang Wang
Training generative adversarial networks (GANs) with limited real image data generally results in deteriorated performance and collapsed models.
1 code implementation • 22 Feb 2021 • Xinyu Gong, Wuyang Chen, Tianlong Chen, Zhangyang Wang
We present Sandwich Batch Normalization (SaBN), a frustratingly easy improvement of Batch Normalization (BN) with only a few lines of code changes.
Ranked #20 on Neural Architecture Search on NAS-Bench-201, CIFAR-100
2 code implementations • 12 Feb 2021 • Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang
With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.
1 code implementation • 8 Jan 2021 • Ajay Kumar Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang
In this paper, we demonstrate that it is unnecessary for sparse retraining to strictly inherit those properties from the dense network.
no code implementations • 1 Jan 2021 • Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Image segmentation lays the foundation for many high-stakes vision applications such as autonomous driving and medical image analysis.
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
A recent study (Rice et al., 2020) revealed overfitting to be a dominant phenomenon in adversarially robust training of deep networks, and that appropriate early-stopping of adversarial training (AT) could match the performance gains of most recent algorithmic improvements.
no code implementations • ICLR 2021 • Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang
We first present Twin L2O, the first dedicated minimax L2O framework consisting of two LSTMs for updating min and max variables, respectively.
no code implementations • 1 Jan 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Yu Hu, Zhangyang Wang, Jingjing Liu
Adversarial training is an effective method to combat adversarial attacks in order to create robust neural networks.
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
In view of those, we introduce two pruning options, i.e., top-down and bottom-up, for finding lifelong tickets.
no code implementations • 1 Jan 2021 • Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.
1 code implementation • CVPR 2021 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
We extend the scope of LTH and question whether matching subnetworks that enjoy the same downstream transfer performance still exist in pre-trained computer vision models.
1 code implementation • NeurIPS 2020 • Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. In this work, we improve robustness-aware self-supervised pre-training by learning representations that are consistent under both data augmentations and adversarial perturbations.
1 code implementation • NeurIPS 2020 • Haotao Wang, Tianlong Chen, Shupeng Gui, Ting-Kuei Hu, Ji Liu, Zhangyang Wang
The trained model could be adjusted among different standard and robust accuracies "for free" at testing time.
4 code implementations • NeurIPS 2020 • Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang shen
In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
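The core objective in such contrastive frameworks pulls together embeddings of two augmented views of the same graph and pushes apart those of different graphs. Below is a common SimCLR-style instantiation of that loss (a simplified variant offered for illustration, not the framework's released code).

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Contrastive loss between embeddings of two augmented views: z1[i] and z2[i]
    are a positive pair, every other pairing in the batch is a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                   # [B, B] similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)
```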
1 code implementation • NeurIPS 2020 • Tianlong Chen, Weiyi Zhang, Jingyang Zhou, Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang
Learning to optimize (L2O) has gained increasing attention since classical optimizers require laborious problem-specific design and hyperparameter tuning.
no code implementations • 6 Oct 2020 • Yuli Zheng, Zhenyu Wu, Ye Yuan, Tianlong Chen, Zhangyang Wang
While machine learning is increasingly used in this field, the resulting large-scale collection of user private information has reinvigorated the privacy debate, considering dozens of data breach incidents every year caused by unauthorized hackers, and (potentially even more) information misuse/abuse by authorized parties.
2 code implementations • NeurIPS 2020 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.
1 code implementation • 25 Jun 2020 • Yi Wang, Jingyang Zhou, Tianlong Chen, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang
Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering.
1 code implementation • ICML 2020 • Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang
Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples.
1 code implementation • ICML 2020 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
We first elaborate three mechanisms to incorporate self-supervision into GCNs, analyze the limitations of pretraining & finetuning and self-training, and proceed to focus on multi-task learning.
1 code implementation • 22 May 2020 • Prateek Shroff, Tianlong Chen, Yunchao Wei, Zhangyang Wang
In this paper, we focus on these marginal differences to extract more representative features.
3 code implementations • 7 May 2020 • Shaojin Ding, Tianlong Chen, Xinyu Gong, Weiwei Zha, Zhangyang Wang
Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.
Ranked #6 on Speaker Identification on VoxCeleb1 (using extra training data)
no code implementations • LREC 2020 • Xiaojing Yu, Tianlong Chen, Zhengjie Yu, Huiyu Li, Yang Yang, Xiaoqian Jiang, Anxiao Jiang
Compared to existing datasets, the queries here are derived from the eligibility criteria of clinical trials and include order-sensitive, counting-based, and Boolean-type cases that are not seen in prior datasets.
2 code implementations • CVPR 2020 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
Graph convolution networks (GCN) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets.
1 code implementation • CVPR 2020 • Tianlong Chen, Sijia Liu, Shiyu Chang, Yu Cheng, Lisa Amini, Zhangyang Wang
We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins (e.g., 3.83% on robust accuracy and 1.3% on standard accuracy on the CIFAR-10 dataset), compared with the conventional end-to-end adversarial training baseline.
1 code implementation • ICLR 2020 • Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
On the other hand, the trained classifiers have traditionally been evaluated on small and fixed sets of test images, which are deemed to be extremely sparsely distributed in the space of all natural images.
2 code implementations • ICLR 2020 • Ting-Kuei Hu, Tianlong Chen, Haotao Wang, Zhangyang Wang
Deep networks were recently suggested to face the odds between accuracy (on clean natural images) and robustness (on adversarially perturbed images) (Tsipras et al., 2019).
no code implementations • 6 Feb 2020 • Ting-Kuei Hu, Fernando Gama, Tianlong Chen, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler
More specifically, we consider that each robot has access to a visual perception of the immediate surroundings, and communication capabilities to transmit and receive messages from other neighboring robots.
1 code implementation • 26 Nov 2019 • Ye Yuan, Wuyang Chen, Tianlong Chen, Yang Yang, Zhou Ren, Zhangyang Wang, Gang Hua
Many real-world applications, such as city-scale traffic monitoring and control, require large-scale re-identification.
1 code implementation • NeurIPS 2019 • Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks.
5 code implementations • ICCV 2019 • Tianlong Chen, Shaojin Ding, Jingyi Xie, Ye Yuan, Wuyang Chen, Yang Yang, Zhou Ren, Zhangyang Wang
Attention mechanism has been shown to be effective for person re-identification (Re-ID).
Ranked #16 on Person Re-Identification on Market-1501-C