1 code implementation • ECCV 2020 • Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin
There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.
no code implementations • 23 May 2024 • Guibin Zhang, Xiangguo Sun, Yanwei Yue, Kun Wang, Tianlong Chen, Shirui Pan
Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
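The per-node routing idea can be sketched compactly. Below is a minimal, illustrative PyTorch sketch of top-1 routing among sparsifier "experts" that, for simplicity, differ only in how many of a node's edges they keep; the class name, the keep-ratio experts, and the hard argmax routing are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MoGSketch(nn.Module):
    """Hypothetical per-node mixture of graph sparsifiers with top-1 routing."""
    def __init__(self, feat_dim, keep_ratios=(0.2, 0.5, 0.8)):
        super().__init__()
        self.keep_ratios = keep_ratios                   # one sparsity level per "expert"
        self.gate = nn.Linear(feat_dim, len(keep_ratios))

    def forward(self, x, edge_index, edge_weight):
        # hard top-1 expert choice per node (the real method would use a soft/learnable gate)
        expert = self.gate(x).argmax(dim=-1)             # [num_nodes]
        keep = torch.zeros(edge_index.size(1), dtype=torch.bool)
        src = edge_index[0]
        for v in range(x.size(0)):                       # prune each node's outgoing edges
            idx = (src == v).nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue
            k = max(1, int(self.keep_ratios[int(expert[v])] * idx.numel()))
            top = edge_weight[idx].topk(k).indices       # keep the node's strongest edges
            keep[idx[top]] = True
        return edge_index[:, keep], edge_weight[keep]
```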
1 code implementation • 8 May 2024 • Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, BoJian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen
In a synergized framework where the LLM and KG mutually enhance each other, we first leverage the LLM to construct an evolving AD-specific knowledge graph (KG) from AD-related scientific literature; we then use a coarse-to-fine sampling method with a novel self-aware knowledge retrieval approach to select appropriate knowledge from the KG and augment the LLM's inference capabilities.
1 code implementation • 30 Apr 2024 • Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen
Nevertheless, one of its major bottlenecks is matrix inversion, which is notably time-consuming, requiring $O(N^3)$ time, and scales poorly.
no code implementations • 7 Apr 2024 • YiFan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
Our dataset and benchmark initiative reveals the nature and rationale of facial affective behaviors, i.e., fine-grained facial movement, interpretability, and reasoning.
no code implementations • 5 Apr 2024 • Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella
In this work, we observe the saturation of the computationally expensive feed-forward blocks of LLM layers and propose FFN-SkipLLM, a novel fine-grained skipping strategy for autoregressive LLMs.
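A simple way to see this saturation is to probe, on calibration data, how little each feed-forward block changes its input. The sketch below does exactly that; the cosine-similarity probe and the residual-FFN assumption are illustrative, not the paper's actual skipping policy.

```python
import torch

@torch.no_grad()
def ffn_saturation_scores(ffn_blocks, hidden):
    """Score each FFN block by the cosine similarity between its input and output
    on calibration activations; a high score marks a 'saturated' block that is a
    candidate for skipping."""
    scores = []
    for ffn in ffn_blocks:                  # each ffn: the feed-forward sub-module of a layer
        out = hidden + ffn(hidden)          # assumes a residual feed-forward block
        sim = torch.nn.functional.cosine_similarity(hidden, out, dim=-1).mean()
        scores.append(sim.item())
        hidden = out
    return scores                           # skip blocks whose score exceeds a chosen threshold
```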
1 code implementation • 3 Apr 2024 • Shwai He, Tianlong Chen
Moreover, while parameter-efficient LoRA fine-tuning has been proposed to recover the performance of sparse models, merging the weights poses a significant challenge: dense LoRA modules are incompatible with sparse models, and folding them in destroys the sparsity of the pruned weights.
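The incompatibility is easy to see in code: folding a dense low-rank update into a pruned weight fills its zeroed entries. The sketch below shows the naive merge plus one obvious (illustrative, not the paper's) remedy of reapplying the binary mask.

```python
import torch

def merge_lora_into_pruned(weight, mask, lora_A, lora_B, scaling=1.0):
    """Fold a dense LoRA update (B @ A) into a pruned weight.
    Adding the dense delta alone would fill the pruned (zero) entries and destroy
    sparsity; reapplying the binary mask preserves the pruned pattern."""
    delta = scaling * (lora_B @ lora_A)     # dense low-rank update
    return (weight + delta) * mask          # keep the original sparsity pattern

# toy usage: a ~50%-sparse weight stays ~50%-sparse after merging
W = torch.randn(64, 64)
M = (torch.rand_like(W) > 0.5).float()
A, B = torch.randn(8, 64), torch.randn(64, 8)
W_merged = merge_lora_into_pruned(W * M, M, A, B)
print((W_merged != 0).float().mean())       # roughly 0.5: sparsity is preserved
```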
no code implementations • 30 Mar 2024 • Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo
Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility.
no code implementations • 11 Mar 2024 • Chi-Yang Hsu, Kyle Cox, Jiawei Xu, Zhen Tan, Tianhua Zhai, Mengzhou Hu, Dexter Pratt, Tianlong Chen, Ziniu Hu, Ying Ding
We present the Thought Graph as a novel framework to support complex reasoning and use gene set analysis as an example to uncover semantic relationships between biological processes.
no code implementations • 8 Mar 2024 • Zhen Tan, Jie Peng, Tianlong Chen, Huan Liu
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks through few-shot or zero-shot prompting, bypassing the need for parameter tuning.
no code implementations • 7 Mar 2024 • Tiejin Chen, Longchao Da, Huixue Zhou, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei
The privacy concerns associated with the use of Large Language Models (LLMs) have grown recently with the development of LLMs such as ChatGPT.
no code implementations • 2 Mar 2024 • Song Wang, Zhen Tan, Xinyu Zhao, Tianlong Chen, Huan Liu, Jundong Li
In contrast, in this work, we propose a novel self-conditioned graph generation framework designed to explicitly model graph distributions and employ these distributions to guide the generation process.
1 code implementation • 22 Feb 2024 • Lichi Li, Zainul Abi Din, Zhen Tan, Sam London, Tianlong Chen, Ajay Daptardar
In the evolving e-commerce field, recommendation systems crucially shape user experience and engagement.
1 code implementation • 22 Feb 2024 • Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.
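One minimal way to realize "selective retention of moderately high-loss samples" is to score every example by its current loss and keep only those in a mid-to-high band. The quantile band and per-sample cross-entropy scoring below are illustrative choices, not the paper's exact selection rule.

```python
import torch

@torch.no_grad()
def select_moderately_hard(model, loader, loss_fn, low_q=0.5, high_q=0.9, device="cpu"):
    """Return dataset indices whose loss falls between the low_q and high_q quantiles.
    loss_fn is expected to be a per-sample loss (reduction='none')."""
    losses, indices, offset = [], [], 0
    for x, y in loader:
        per_sample = loss_fn(model(x.to(device)), y.to(device)).cpu()
        losses.append(per_sample)
        indices.append(torch.arange(offset, offset + per_sample.numel()))
        offset += per_sample.numel()
    losses, indices = torch.cat(losses), torch.cat(indices)
    lo, hi = losses.quantile(low_q), losses.quantile(high_q)
    return indices[(losses >= lo) & (losses <= hi)]     # samples to retain for continual training
```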
no code implementations • 22 Feb 2024 • Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Huaxiu Yao, Yue Zhang, Ren Wang, Kaidi Xu, Xiaoshuang Shi
Uncertainty estimation plays a pivotal role in ensuring the reliability of safety-critical human-AI interaction systems, particularly in the medical domain.
1 code implementation • 20 Feb 2024 • Zhen Tan, Chengshuai Zhao, Raha Moraffah, YiFan Li, Yu Kong, Tianlong Chen, Huan Liu
Unlike direct harmful output generation for MLLMs, our research demonstrates how a single MLLM agent can be subtly influenced to generate prompts that, in turn, induce other MLLM agents in the society to output malicious content.
1 code implementation • 19 Feb 2024 • Jinhao Duan, Renming Zhang, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Elias Stengel-Eskin, Mohit Bansal, Tianlong Chen, Kaidi Xu
As Large Language Models (LLMs) are integrated into critical real-world applications, their strategic and logical reasoning abilities are increasingly crucial.
1 code implementation • 18 Feb 2024 • Yihua Zhang, Pingzhi Li, Junyuan Hong, Jiaxiang Li, Yimeng Zhang, Wenqing Zheng, Pin-Yu Chen, Jason D. Lee, Wotao Yin, Mingyi Hong, Zhangyang Wang, Sijia Liu, Tianlong Chen
In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has become standard.
no code implementations • 2 Feb 2024 • Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen
Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor.
1 code implementation • 28 Jan 2024 • Dawei Li, Zhen Tan, Tianlong Chen, Huan Liu
While textual information significantly enhances the performance of pre-trained language models (PLMs) in knowledge graph completion (KGC), the static and noisy nature of existing corpora collected from Wikipedia articles or synset definitions often limits the potential of PLM-based KGC models.
1 code implementation • 10 Jan 2024 • Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang
To address these two pain points, we propose QuantumSEA, an in-time sparse exploration for noise-adaptive quantum circuits, aiming to achieve two key objectives: (1) implicit circuit capacity during training, by dynamically exploring the circuit's sparse connectivity while keeping a fixed, small number of quantum gates throughout training, which satisfies coherence-time constraints and incurs only light noise, enabling feasible execution on real quantum devices; and (2) noise robustness, by jointly optimizing the topology and parameters of quantum circuits under real-device noise models.
1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions.
2 code implementations • 22 Dec 2023 • Zhen Tan, Tianlong Chen, Zhenyu Zhang, Huan Liu
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
no code implementations • 9 Dec 2023 • Tianjin Huang, Tianlong Chen, Zhangyang Wang, Shiwei Liu
Therefore, it remains unclear whether the self-attention operation is crucial for the recent advances in SSL, or whether CNNs can deliver the same excellence with more advanced designs.
1 code implementation • 3 Dec 2023 • Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen
The rapid development of large-scale deep learning models strains the affordability of hardware platforms, which necessitates pruning to reduce their computational and memory footprints.
1 code implementation • 3 Dec 2023 • Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang
Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
1 code implementation • 27 Nov 2023 • Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu
Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization.
no code implementations • 15 Nov 2023 • Yun Zhu, Nevan Wichers, Chu-Cheng Lin, Xinyi Wang, Tianlong Chen, Lei Shu, Han Lu, Canoee Liu, Liangchen Luo, Jindong Chen, Lei Meng
Parameter-Efficient Tuning has been a prominent approach for adapting Large Language Models to downstream tasks.
1 code implementation • 2 Oct 2023 • Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
Sparsely activated Mixture-of-Experts (SMoE) has shown promise in scaling up the learning capacity of neural networks; however, it has issues such as (a) high memory usage, due to duplicating network layers into multiple copies as experts, and (b) redundancy in experts, as common learning-based routing policies suffer from representational collapse.
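One natural way to attack the memory and redundancy issues together is to merge experts that have become near-duplicates. The sketch below greedily groups experts by raw weight similarity and averages each group; this grouping criterion is an illustrative stand-in, not the paper's exact merging rule.

```python
import torch
import torch.nn.functional as F

def merge_similar_experts(expert_weights, threshold=0.95):
    """Greedily group experts whose flattened weights have cosine similarity above
    `threshold` and replace each group by its average, shrinking the duplicated
    expert copies. Returns the merged weights and the old-to-group assignment."""
    flat = F.normalize(torch.stack([w.flatten() for w in expert_weights]), dim=1)
    sim = flat @ flat.t()                                # pairwise expert similarity
    groups, assigned = [], set()
    for i in range(len(expert_weights)):
        if i in assigned:
            continue
        group = [j for j in range(len(expert_weights))
                 if j not in assigned and sim[i, j] >= threshold]
        assigned.update(group)
        groups.append(group)
    merged = [torch.stack([expert_weights[j] for j in g]).mean(dim=0) for g in groups]
    return merged, groups
```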
1 code implementation • ICCV 2023 • Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma, Yi Wang, Zhangyang Wang
Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field.
1 code implementation • ICCV 2023 • Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, huan zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu
Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How to adversarially robustify a CNN-based MoE model?
1 code implementation • 25 Jun 2023 • Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlaod Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy
Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
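The eviction rule itself is compact: always keep the most recent tokens, plus the tokens with the largest accumulated attention mass. The following per-sequence sketch omits batching and per-head bookkeeping, so it is a simplification of the policy described above rather than the released implementation.

```python
import torch

def h2o_keep_mask(acc_attn, num_recent, num_heavy):
    """Decide which cached tokens survive eviction: the most recent `num_recent`
    tokens plus the `num_heavy` 'heavy hitter' tokens with the largest
    accumulated attention mass among the rest."""
    T = acc_attn.size(0)                                # current KV-cache length
    keep = torch.zeros(T, dtype=torch.bool)
    keep[-num_recent:] = True                           # always keep recent tokens
    older = acc_attn.masked_fill(keep, float("-inf"))   # exclude recents from the heavy pool
    k = max(0, min(num_heavy, T - num_recent))
    if k > 0:
        keep[older.topk(k).indices] = True
    return keep                                         # evict K/V entries where keep is False

# usage: acc_attn holds per-token attention mass accumulated during decoding
mask = h2o_keep_mask(torch.rand(128), num_recent=16, num_heavy=16)
```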
1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang
Motivated by recent observations of model soups, which suggest that the fine-tuned weights of multiple models can be merged into a better minimum, we propose Instant Soup Pruning (ISP) to generate lottery-ticket-quality subnetworks at a fraction of the original IMP cost, by replacing IMP's expensive intermediate pruning stages with a computationally efficient weak-mask generation and aggregation routine.
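The weak-mask idea can be sketched as: generate several cheap magnitude masks, then aggregate them by voting and re-threshold to the target sparsity. The voting rule below is an illustrative stand-in for ISP's aggregation routine, not the paper's exact procedure.

```python
import torch

def magnitude_mask(weight, sparsity):
    """Cheap 'weak' mask: keep the (1 - sparsity) fraction of largest-magnitude weights."""
    k = max(1, int(weight.numel() * (1 - sparsity)))
    thresh = weight.abs().flatten().topk(k).values.min()
    return (weight.abs() >= thresh).float()

def soup_mask(weak_masks, sparsity):
    """Aggregate several weak masks by voting, then re-threshold the votes so the
    final mask hits the target sparsity."""
    votes = torch.stack(weak_masks).mean(dim=0)          # fraction of masks keeping each weight
    k = max(1, int(votes.numel() * (1 - sparsity)))
    thresh = votes.flatten().topk(k).values.min()
    return (votes >= thresh).float()
```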
1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang
By partitioning giant graph data, we build multiple weaker GNNs (soup ingredients) trained independently and in parallel without any intermediate communication, and combine their strengths using a greedy interpolation soup procedure to achieve state-of-the-art performance.
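The greedy interpolation step follows the familiar model-soups recipe: average in an ingredient's weights only if validation performance does not drop. The sketch below covers just that step (assuming floating-point parameters); the graph partitioning and the GNN training that produce the ingredients are outside this snippet.

```python
import copy

def greedy_weight_soup(models, eval_fn):
    """Greedy soup: running average of ingredient state_dicts, keeping an ingredient
    only if eval_fn (a validation score on a candidate state_dict) does not decrease."""
    soup = copy.deepcopy(models[0].state_dict())
    n, best = 1, eval_fn(soup)
    for m in models[1:]:
        cand = m.state_dict()
        trial = {k: (v * n + cand[k]) / (n + 1) for k, v in soup.items()}
        score = eval_fn(trial)
        if score >= best:                 # keep this ingredient in the soup
            soup, n, best = trial, n + 1, score
    return soup
```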
1 code implementation • 3 Mar 2023 • Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang
In pursuit of a more general evaluation and unveiling the true potential of sparse algorithms, we introduce the "Sparsity May Cry" Benchmark (SMC-Bench), a collection of 4 carefully curated, diverse tasks with 10 datasets that captures a wide range of domain-specific and sophisticated knowledge.
1 code implementation • 2 Mar 2023 • Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Despite their remarkable achievement, gigantic transformers encounter significant drawbacks, including exorbitant computational and memory footprints during training, as well as severe collapse evidenced by a high degree of parameter redundancy.
1 code implementation • 28 Feb 2023 • Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang
This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same "task distribution".
1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.
1 code implementation • ICCV 2023 • Tianlong Chen, Xuxi Chen, Xianzhi Du, Abdullah Rashwan, Fan Yang, Huizhong Chen, Zhangyang Wang, Yeqing Li
Instead of compressing multiple tasks' knowledge into a single model, MoE separates the parameter space and only utilizes the relevant model pieces given task type and its input, which provides stabilized MTL training and ultra-efficient inference.
no code implementations • 6 Dec 2022 • Ajay Jaiswal, Tianlong Chen, Justin F. Rousseau, Yifan Peng, Ying Ding, Zhangyang Wang
However, DNNs are notoriously fragile to the class imbalance in image classification.
1 code implementation • 28 Nov 2022 • Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu
Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks).
1 code implementation • 19 Nov 2022 • Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang
Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.
no code implementations • 9 Nov 2022 • Kaixiong Zhou, Zhenyu Zhang, Shengyuan Chen, Tianlong Chen, Xiao Huang, Zhangyang Wang, Xia Hu
Quantum neural networks (QNNs), an interdisciplinary field of quantum computing and machine learning, have attracted tremendous research interests due to the specific quantum advantages.
1 code implementation • NIPS 2022 • Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
Improving the performance of deep networks in data-limited regimes has warranted much attention.
1 code implementation • 26 Oct 2022 • Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang
However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at inference, current MTL regimes have to activate nearly the entire model even to just execute a single task.
1 code implementation • 14 Oct 2022 • Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin F. Rousseau, Ying Ding, Zhangyang Wang
In this paper, firstly, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability, as well as achieve state-of-the-art (SOTA) level performance from vanilla-GCNs.
2 code implementations • 14 Oct 2022 • Keyu Duan, Zirui Liu, Peihao Wang, Wenqing Zheng, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
Ranked #2 on Node Property Prediction on ogbn-products
1 code implementation • 8 Oct 2022 • Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu
To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.
1 code implementation • 7 Oct 2022 • Tianxin Wei, Yuning You, Tianlong Chen, Yang shen, Jingrui He, Zhangyang Wang
This paper targets improving the generalizability of hypergraph neural networks in the low-label regime by applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL).
2 code implementations • 15 Sep 2022 • Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zhangyang Wang
Vision Transformers (ViTs) have proven effective in solving 2D image understanding tasks by training over large-scale image datasets, and meanwhile, as a somewhat separate track, in modeling the 3D visual world, such as voxels or point clouds.
1 code implementation • 27 Jul 2022 • Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
While prior works on NeRFs optimize a scene representation by inverting a handcrafted rendering equation, GNT achieves neural representation and rendering that generalizes across scenes using transformers at two stages.
Ranked #1 on Generalizable Novel View Synthesis on LLFF
1 code implementation • 8 Jul 2022 • Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
In this paper, we present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID) from a data collection and representing INR as a functional combination of basis sampled from the dictionary.
1 code implementation • 7 Jul 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Xuxi Chen, Qiao Xiao, Boqian Wu, Tommi Kärkkäinen, Mykola Pechenizkiy, Decebal Mocanu, Zhangyang Wang
Transformers have quickly shined in the computer vision world since the emergence of Vision Transformers (ViTs).
1 code implementation • CVPR 2022 • Tianlong Chen, Peihao Wang, Zhiwen Fan, Zhangyang Wang
Inspired by that, we propose Augmented NeRF (Aug-NeRF), which for the first time brings the power of robust data augmentations into regularizing the NeRF training.
1 code implementation • 26 Jun 2022 • Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang
Pruning large neural networks to create high-quality, independently trainable sparse masks, which can maintain similar performance to their dense counterparts, is very desirable due to the reduced space and time complexity.
no code implementations • 15 Jun 2022 • Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang
Inspired by the recent success of learning robust models with unlabeled data, we explore a new robustness-aware CIL setting, where the learned adversarial robustness has to resist forgetting and be transferred as new tasks come in continually.
1 code implementation • 15 Jun 2022 • Tianlong Chen, huan zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang
Certifiable robustness is a highly desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios, but often demands tedious computations to establish.
1 code implementation • 15 Jun 2022 • Zhangheng Li, Tianlong Chen, Linyi Li, Bo Li, Zhangyang Wang
Given the fact that neural networks are often over-parameterized, one effective way to reduce such computational overhead is neural network pruning, by removing redundant parameters from trained neural networks.
1 code implementation • 9 Jun 2022 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang
For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, Zhangyang Wang
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger.
1 code implementation • 4 Apr 2022 • Diganta Misra, Bharat Runwal, Tianlong Chen, Zhangyang Wang, Irina Rish
With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings.
1 code implementation • ICLR 2022 • Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen yang, Ji Liu, Zhangyang Wang
Vision transformers (ViTs) have gained popularity recently.
1 code implementation • ICLR 2022 • Wenqing Zheng, Tianlong Chen, Ting-Kuei Hu, Zhangyang Wang
Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks.
1 code implementation • ICLR 2022 • Tianshu Huang, Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang
Selecting an appropriate optimizer for a given problem is of major interest for researchers and practitioners.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang
However, a "head-to-toe assessment" regarding the extent of redundancy in ViTs, and how much we could gain by thoroughly mitigating such, has been absent for this field.
1 code implementation • 9 Mar 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang
The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
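Because a softmaxed attention matrix is row-stochastic, its low-pass component is simply the uniform matrix, which makes the decomposition above one line of code. The sketch below follows that description; the fixed scalar `omega` stands in for what may be a learnable rescaling factor, and this is not the released implementation.

```python
import torch

def attn_scale(attn, omega=2.0):
    """Split a row-stochastic attention matrix into its low-pass (uniform) and
    high-pass parts, then re-weight the high-pass part to counteract the
    over-smoothing of deep ViTs."""
    n = attn.size(-1)
    low = torch.full_like(attn, 1.0 / n)     # DC / low-pass component of a softmaxed matrix
    high = attn - low                        # high-pass residual
    return low + omega * high                # all-pass, high-frequency-boosted attention
```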
no code implementations • 5 Mar 2022 • Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen
Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.
1 code implementation • ICLR 2022 • Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang
We introduce two alternatives for sparse adversarial training: (i) static sparsity, by leveraging recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from the early training; (ii) dynamic sparsity, by allowing the sparse subnetwork to adaptively adjust its connectivity pattern (while sticking to the same sparsity ratio) throughout training.
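For the dynamic-sparsity option, one standard prune-and-grow update (in the spirit of RigL) keeps the sparsity ratio fixed while letting the connectivity pattern adapt; the sketch below is that generic update, not the paper's exact schedule.

```python
import torch

@torch.no_grad()
def prune_and_grow(weight, grad, mask, update_frac=0.1):
    """One dynamic-sparsity step: drop the smallest-magnitude active weights and
    regrow the same number of inactive ones with the largest gradient magnitude,
    so the overall sparsity ratio stays fixed."""
    active = mask.bool()
    n_update = int(update_frac * active.sum().item())
    if n_update == 0:
        return mask
    # drop: smallest |w| among currently active connections
    w_act = weight.abs().masked_fill(~active, float("inf"))
    drop = w_act.flatten().topk(n_update, largest=False).indices
    # grow: largest |grad| among currently inactive connections
    g_inact = grad.abs().masked_fill(active, float("-inf"))
    grow = g_inact.flatten().topk(n_update).indices
    new_mask = mask.clone().flatten()
    new_mask[drop], new_mask[grow] = 0.0, 1.0
    return new_mask.view_as(mask)
```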
1 code implementation • 9 Feb 2022 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
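For reference, the textbook loop used to find such winning tickets is iterative magnitude pruning (IMP) with weight rewinding: train, prune the smallest surviving weights, rewind, and repeat. The sketch below is that generic loop under simplifying assumptions (per-tensor pruning of weight matrices, a user-supplied `train_fn` that respects the masks), not this paper's specific procedure.

```python
import copy
import torch

def imp_find_ticket(model, train_fn, rounds=5, prune_frac=0.2):
    """Iterative magnitude pruning with rewinding: after each training round, keep
    only the largest-magnitude surviving weights, then reset weights to their
    initial values. Returns the binary masks defining the winning ticket."""
    init = copy.deepcopy(model.state_dict())
    masks = {k: torch.ones_like(v) for k, v in model.state_dict().items() if v.dim() > 1}
    for _ in range(rounds):
        train_fn(model, masks)                          # train with masks applied to the weights
        for k, m in masks.items():
            w = model.state_dict()[k].abs()
            surviving = w[m.bool()]
            n_keep = max(1, int((1 - prune_frac) * surviving.numel()))
            thresh = surviving.topk(n_keep).values.min()
            masks[k] = ((w >= thresh) & m.bool()).float()
        model.load_state_dict(init)                     # rewind remaining weights to init
    return masks
```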
1 code implementation • ICLR 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy
In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks.
no code implementations • 17 Jan 2022 • Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang
To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.
1 code implementation • 4 Jan 2022 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
Accordingly, we have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators, assuming that graph priors per se, similar to the concept of image manifolds, can be learned by data generation.
1 code implementation • CVPR 2022 • Zhiwen Fan, Tianlong Chen, Peihao Wang, Zhangyang Wang
CADTransformer tokenizes directly from the set of graphical primitives in CAD drawings, and correspondingly optimizes line-grained semantic and instance symbol spotting altogether by a pair of prediction heads.
1 code implementation • NeurIPS 2021 • Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang
Contrastive learning approaches have achieved great success in learning visual representations with few labels of the target classes.
1 code implementation • NeurIPS 2021 • Xuxi Chen, Tianlong Chen, Zhenyu Zhang, Zhangyang Wang
The lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork (i.e., a winning ticket) instead of a full model for both training and inference, which can lower both costs without sacrificing performance.
1 code implementation • 30 Oct 2021 • Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng
To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
no code implementations • 28 Oct 2021 • Haotian Xue, Kaixiong Zhou, Tianlong Chen, Kai Guo, Xia Hu, Yi Chang, Xin Wang
In this paper, we investigate GNNs from the lens of weight and feature loss landscapes, i.e., the loss changes with respect to model weights and node features, respectively.
1 code implementation • 9 Oct 2021 • Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language was seen as an individual task and was learned sequentially and continually.
no code implementations • 7 Oct 2021 • William T. Redman, Tianlong Chen, Zhangyang Wang, Akshunna S. Dogra
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
no code implementations • 29 Sep 2021 • Yongduo Sui, Xiang Wang, Tianlong Chen, Xiangnan He, Tat-Seng Chua
In this work, we propose a simple and effective learning paradigm, Inductive Co-Pruning of GNNs (ICPG), to endow graph lottery tickets with inductive pruning capacity.
1 code implementation • ICLR 2022 • Lu Miao, Xiaolong Luo, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang
Conventional methods often require (iterative) pruning followed by re-training, which not only incurs large overhead beyond the original DNN training but also can be sensitive to retraining hyperparameters.
no code implementations • 29 Sep 2021 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
no code implementations • 29 Sep 2021 • William T Redman, Tianlong Chen, Akshunna S. Dogra, Zhangyang Wang
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
no code implementations • ICLR 2022 • Shaojin Ding, Tianlong Chen, Zhangyang Wang
In this paper, we investigate the tantalizing possibility of using the lottery ticket hypothesis to discover lightweight speech recognition models that are (1) robust to various noise existing in speech; (2) transferable to fit open-world personalization; and (3) compatible with structured sparsity.
no code implementations • ICLR 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang
The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
no code implementations • 29 Sep 2021 • Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen
Perhaps most importantly, we find that instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.
no code implementations • ICLR 2022 • Yuning You, Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.
no code implementations • 29 Sep 2021 • Duc N.M Hoang, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang
Despite the preliminary success, we argue that for GNNs, NAS has to be customized further, due to the topological complicacy of GNN input data (graph) as well as the notorious training instability.
no code implementations • 29 Sep 2021 • Haoyu Ma, Yifan Huang, Tianlong Chen, Hao Tang, Chenyu You, Zhangyang Wang, Xiaohui Xie
However, it is unclear why the distorted distribution of the logits is catastrophic to the student model.
no code implementations • 29 Sep 2021 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.
1 code implementation • 24 Aug 2021 • Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang
In view of those, we present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
no code implementations • 16 Jul 2021 • Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin
Semantic segmentation for scene understanding is now in wide demand, raising significant challenges for algorithm efficiency, especially for applications on resource-limited platforms.
2 code implementations • NeurIPS 2021 • Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang
Based on our analysis, we summarize a guideline for parameter settings with regard to specific architecture characteristics, which we hope will catalyze research progress on the lottery ticket hypothesis.
2 code implementations • ICLR 2022 • Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
Our framework, FreeTickets, is defined as the ensemble of these relatively cheap sparse subnetworks.
1 code implementation • 24 Jun 2021 • Ting-Kuei Hu, Fernando Gama, Tianlong Chen, Wenqing Zheng, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler
Our framework is implemented by a cascade of a convolutional and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation and agent action inference, respectively.
2 code implementations • NeurIPS 2021 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has drawn considerable attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).
1 code implementation • 10 Jun 2021 • Mingkang Zhu, Tianlong Chen, Zhangyang Wang
Compared to state-of-the-art methods, our homotopy attack leads to significantly fewer perturbations, e.g., reducing them by 42.91% on CIFAR-10 and 75.03% on ImageNet (average case, targeted attack) at similar maximal perturbation magnitudes, while still achieving 100% attack success rates.
2 code implementations • 10 Jun 2021 • Yuning You, Tianlong Chen, Yang shen, Zhangyang Wang
Unfortunately, unlike its counterpart on image data, the effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset, by either rules of thumb or trial-and-errors, owing to the diverse nature of graph data.
1 code implementation • NeurIPS 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
For example, our sparsified DeiT-Small at (5%, 50%) sparsity for (data, architecture) improves top-1 accuracy by 0.28%, while enjoying 49.32% FLOPs and 4.40% running-time savings.
Ranked #20 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • 6 Jun 2021 • Zhenyu Zhang, Xuxi Chen, Tianlong Chen, Zhangyang Wang
We observe that a high-quality winning ticket can be found by training and pruning the dense network on the very compact PrAC set, which substantially saves training iterations in the ticket-finding process.
1 code implementation • 6 Jun 2021 • Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang
Hence, the key innovation in SDCLR is to create a dynamic self-competitor model to contrast with the target model, which is a pruned version of the latter.
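The pruning step that produces the self-competitor is simple to sketch: copy the current target encoder and zero out its smallest-magnitude weights, re-creating the copy as the target evolves. Magnitude pruning is used below as an illustrative choice, and the subsequent contrastive step between the two branches is omitted.

```python
import copy
import torch

def make_self_competitor(target_encoder, sparsity=0.9):
    """Build the dynamic self-competitor: a deep copy of the current target encoder
    with its smallest-magnitude weights zeroed out."""
    competitor = copy.deepcopy(target_encoder)
    with torch.no_grad():
        for p in competitor.parameters():
            if p.dim() > 1:                              # prune weight matrices only
                k = max(1, int(p.numel() * sparsity))
                thresh = p.abs().flatten().kthvalue(k).values
                p.mul_((p.abs() > thresh).float())
    return competitor
```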
1 code implementation • ICLR 2021 • Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen
In this work, we for the first time study the existence of such trainable matching subnetworks in deep GANs.
1 code implementation • ICLR 2021 • Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang
Knowledge Distillation (KD) is a widely used technique to transfer knowledge from pre-trained teacher models to (usually more lightweight) student models.
no code implementations • CVPR 2021 • Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics.
no code implementations • 23 Apr 2021 • Zhe Gan, Yen-Chun Chen, Linjie Li, Tianlong Chen, Yu Cheng, Shuohang Wang, Jingjing Liu, Lijuan Wang, Zicheng Liu
However, we can find "relaxed" winning tickets at 50%-70% sparsity that maintain 99% of the full accuracy.
1 code implementation • 22 Apr 2021 • Arman Maesumi, Mingkang Zhu, Yi Wang, Tianlong Chen, Zhangyang Wang, Chandrajit Bajaj
This paper presents a novel patch-based adversarial attack pipeline that trains adversarial patches on 3D human meshes.
1 code implementation • 16 Apr 2021 • Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang
However, the BN layer is costly to calculate and is typically implemented with non-binary parameters, leaving a hurdle for the efficient implementation of BNN training.
Ranked #167 on Image Classification on CIFAR-10
1 code implementation • 23 Mar 2021 • Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin
It automates the design of an optimization method based on its performance on a set of training problems.
1 code implementation • 22 Mar 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, JianFeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
1 code implementation • NeurIPS 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Jingjing Liu, Zhangyang Wang
Training generative adversarial networks (GANs) with limited real image data generally results in deteriorated performance and collapsed models.
1 code implementation • 22 Feb 2021 • Xinyu Gong, Wuyang Chen, Tianlong Chen, Zhangyang Wang
We present Sandwich Batch Normalization (SaBN), a frustratingly easy improvement of Batch Normalization (BN) with only a few lines of code changes.
Ranked #20 on Neural Architecture Search on NAS-Bench-201, CIFAR-100
2 code implementations • 12 Feb 2021 • Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang
With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.
1 code implementation • 8 Jan 2021 • Ajay Kumar Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang
In this paper, we demonstrate that it is unnecessary for sparse retraining to strictly inherit those properties from the dense network.
no code implementations • 1 Jan 2021 • Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Image segmentation lays the foundation for many high-stakes vision applications such as autonomous driving and medical image analysis.
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
A recent study (Rice et al., 2020) revealed overfitting to be a dominant phenomenon in adversarially robust training of deep networks, and that appropriate early-stopping of adversarial training (AT) could match the performance gains of most recent algorithmic improvements.
no code implementations • ICLR 2021 • Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang
We first present Twin L2O, the first dedicated minimax L2O framework consisting of two LSTMs for updating min and max variables, respectively.
no code implementations • 1 Jan 2021 • Tianlong Chen, Yu Cheng, Zhe Gan, Yu Hu, Zhangyang Wang, Jingjing Liu
Adversarial training is an effective method to combat adversarial attacks in order to create robust neural networks.
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
In view of those, we introduce two pruning options, i.e., top-down and bottom-up, for finding lifelong tickets.
no code implementations • 1 Jan 2021 • Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.
1 code implementation • CVPR 2021 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
We extend the scope of LTH and question whether matching subnetworks that enjoy the same downstream transfer performance still exist in pre-trained computer vision models.
1 code implementation • NeurIPS 2020 • Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. In this work, we improve robustness-aware self-supervised pre-training by learning representations that are consistent under both data augmentations and adversarial perturbations.
1 code implementation • NeurIPS 2020 • Haotao Wang, Tianlong Chen, Shupeng Gui, Ting-Kuei Hu, Ji Liu, Zhangyang Wang
The trained model could be adjusted among different standard and robust accuracies "for free" at testing time.
4 code implementations • NeurIPS 2020 • Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang shen
In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
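The core objective in such contrastive frameworks pulls together embeddings of two augmented views of the same graph and pushes apart those of different graphs. Below is a common SimCLR-style instantiation of that loss (a simplified variant offered for illustration, not the framework's released code).

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Contrastive loss between embeddings of two augmented views: z1[i] and z2[i]
    are a positive pair, every other pairing in the batch is a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                   # [B, B] similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)
```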
1 code implementation • NeurIPS 2020 • Tianlong Chen, Weiyi Zhang, Jingyang Zhou, Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang
Learning to optimize (L2O) has gained increasing attention since classical optimizers require laborious problem-specific design and hyperparameter tuning.
no code implementations • 6 Oct 2020 • Yuli Zheng, Zhenyu Wu, Ye Yuan, Tianlong Chen, Zhangyang Wang
While machine learning is increasingly used in this field, the resulting large-scale collection of user private information has reinvigorated the privacy debate, considering dozens of data breach incidents every year caused by unauthorized hackers, and (potentially even more) information misuse/abuse by authorized parties.
2 code implementations • NeurIPS 2020 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.
1 code implementation • 25 Jun 2020 • Yi Wang, Jingyang Zhou, Tianlong Chen, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang
Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering.
1 code implementation • ICML 2020 • Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang
Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples.
1 code implementation • ICML 2020 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
We first elaborate three mechanisms to incorporate self-supervision into GCNs, analyze the limitations of pretraining & finetuning and self-training, and proceed to focus on multi-task learning.
1 code implementation • 22 May 2020 • Prateek Shroff, Tianlong Chen, Yunchao Wei, Zhangyang Wang
In this paper, we focus on these marginal differences to extract more representative features.
3 code implementations • 7 May 2020 • Shaojin Ding, Tianlong Chen, Xinyu Gong, Weiwei Zha, Zhangyang Wang
Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.
Ranked #6 on Speaker Identification on VoxCeleb1 (using extra training data)
no code implementations • LREC 2020 • Xiaojing Yu, Tianlong Chen, Zhengjie Yu, Huiyu Li, Yang Yang, Xiaoqian Jiang, Anxiao Jiang
Compared to existing datasets, the queries here are derived from the eligibility criteria of clinical trials and include order-sensitive, counting-based, and Boolean-type cases that are not seen in prior datasets.
2 code implementations • CVPR 2020 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen
Graph convolution networks (GCN) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets.
1 code implementation • CVPR 2020 • Tianlong Chen, Sijia Liu, Shiyu Chang, Yu Cheng, Lisa Amini, Zhangyang Wang
We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins (e.g., 3.83% on robust accuracy and 1.3% on standard accuracy on the CIFAR-10 dataset), compared with the conventional end-to-end adversarial training baseline.
1 code implementation • ICLR 2020 • Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
On the other hand, the trained classifiers have traditionally been evaluated on small and fixed sets of test images, which are deemed to be extremely sparsely distributed in the space of all natural images.
2 code implementations • ICLR 2020 • Ting-Kuei Hu, Tianlong Chen, Haotao Wang, Zhangyang Wang
Deep networks were recently suggested to face the odds between accuracy (on clean natural images) and robustness (on adversarially perturbed images) (Tsipras et al., 2019).
no code implementations • 6 Feb 2020 • Ting-Kuei Hu, Fernando Gama, Tianlong Chen, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler
More specifically, we consider that each robot has access to a visual perception of the immediate surroundings, and communication capabilities to transmit and receive messages from other neighboring robots.
1 code implementation • 26 Nov 2019 • Ye Yuan, Wuyang Chen, Tianlong Chen, Yang Yang, Zhou Ren, Zhangyang Wang, Gang Hua
Many real-world applications, such as city-scale traffic monitoring and control, require large-scale re-identification.
1 code implementation • NeurIPS 2019 • Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen
Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks.
5 code implementations • ICCV 2019 • Tianlong Chen, Shaojin Ding, Jingyi Xie, Ye Yuan, Wuyang Chen, Yang Yang, Zhou Ren, Zhangyang Wang
Attention mechanism has been shown to be effective for person re-identification (Re-ID).
Ranked #16 on Person Re-Identification on Market-1501-C