Search Results for author: Tianlong Chen

Found 212 papers, 135 papers with code

HALO: Hardware-Aware Learning to Optimize

1 code implementation ECCV 2020 Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin

There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.

DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation

1 code implementation 26 May 2025 Pingzhi Li, Zhen Tan, Huaizhi Qu, Huan Liu, Tianlong Chen

Our experiments show that, while preserving or even improving the original performance of the teacher model, student models distilled from the defensively generated teacher outputs show catastrophically reduced performance, demonstrating our method's effectiveness as a practical safeguard against KD-based model imitation.

Knowledge Distillation

I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

1 code implementation 25 May 2025 Jiayi Xin, Sukwon Yun, Jie Peng, Inyoung Choi, Jenna L. Ballard, Tianlong Chen, Qi Long

Modality fusion is a cornerstone of multimodal learning, enabling information integration from diverse data sources.

Mixture-of-Experts multimodal interaction

The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

no code implementations 24 May 2025 Ruichen Zhang, Rana Muhammad Shahroz Khan, Zhen Tan, Dawei Li, Song Wang, Tianlong Chen

Data-centric distillation, including data augmentation, selection, and mixing, offers a promising path to creating smaller, more efficient student Large Language Models (LLMs) that retain strong reasoning abilities.

Data Augmentation

GroverGPT-2: Simulating Grover's Algorithm via Chain-of-Thought Reasoning and Quantum-Native Tokenization

no code implementations 8 May 2025 Min Chen, Jinglei Cheng, Pingzhi Li, Haoran Wang, Tianlong Chen, Junyu Liu

Our results show that GroverGPT-2 can learn and internalize quantum circuit logic through efficient processing of quantum-native tokens, providing direct evidence that classical models like LLMs can capture the structure of quantum algorithms.

Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation

1 code implementation 1 May 2025 Vaidehi Patil, Yi-Lin Sung, Peter Hase, Jie Peng, Tianlong Chen, Mohit Bansal

To address this gap, we first introduce a multimodal unlearning benchmark, UnLOK-VQA (Unlearning Outside Knowledge VQA), as well as an attack-and-defense framework to evaluate methods for deleting specific multimodal knowledge from MLLMs.

Question Answering Specificity +1

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

no code implementations 22 Apr 2025 Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, XiaoFeng Wang, DaCheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu

Currently, existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., the deployment phase or fine-tuning phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs.

Model Editing

SConU: Selective Conformal Uncertainty in Large Language Models

no code implementations 19 Apr 2025 Zhiyuan Wang, Qingni Wang, Yue Zhang, Tianlong Chen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu

As large language models are increasingly utilized in real-world applications, guarantees of task-specific metrics are essential for their reliable deployment.

Conformal Prediction Question Answering

Efficient MAP Estimation of LLM Judgment Performance with Prior Transfer

no code implementations 17 Apr 2025 Huaizhi Qu, Inyoung Choi, Zhen Tan, Song Wang, Sukwon Yun, Qi Long, Faizan Siddiqui, Kwonjoon Lee, Tianlong Chen

Finally, we present BetaConform, a framework that integrates our distribution assumption, adaptive stopping, and the prior transfer mechanism to deliver a theoretically guaranteed distribution estimation of LLM ensemble judgment with minimum labeled samples.

Conformal Prediction TruthfulQA
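The MAP estimation mentioned in the BetaConform entry above can be illustrated with a generic Beta-Binomial MAP estimate of a judge's accuracy. This is a minimal sketch under assumed prior hyperparameters and toy counts, not the paper's actual framework.

```python
def beta_map_accuracy(n_correct, n_total, alpha=2.0, beta=2.0):
    """MAP estimate of a judge's accuracy under a Beta(alpha, beta) prior.

    The prior hyperparameters stand in for knowledge transferred from a
    related judging task; the values here are purely illustrative.
    """
    return (n_correct + alpha - 1) / (n_total + alpha + beta - 2)

# 9 of 12 labeled judgments correct under a mildly informative prior
print(round(beta_map_accuracy(9, 12), 3))  # 0.714
```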

Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations

no code implementations 8 Apr 2025 Ajay Jaiswal, Jianyu Wang, Yixiao Li, Pingzhi Li, Tianlong Chen, Zhangyang Wang, Chong Wang, Ruoming Pang, Xianzhi Du

Moreover, we introduce the benefits of task-agnostic fine-tuning as a correction mechanism during iterative expert dropping, which we term MoE Lottery Subnetworks.

Instruction Following Mixture-of-Experts

Window Token Concatenation for Efficient Visual Large Language Models

1 code implementation 5 Apr 2025 YiFan Li, Wentao Bao, Botao Ye, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong

To further enhance the performance on fine-grained visual understanding tasks, we introduce WiCo+, which decomposes the visual tokens in later layers of the LLM.

Token Reduction

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

no code implementations 3 Apr 2025 Yifan Wang, Runjin Chen, Bolian Li, David Cho, Yihe Deng, Ruqi Zhang, Tianlong Chen, Zhangyang Wang, Ananth Grama, Junyuan Hong

The issue is particularly pronounced when employing stronger models like GPT-4o or larger models in the same family to generate chosen responses paired with target model self-generated rejected responses, resulting in dramatically poorer safety outcomes.

ARC HellaSwag +4

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design

no code implementations 2 Apr 2025 Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen

By employing a gating network to route input tokens, it selectively activates a subset of expert networks to process the corresponding token embeddings.

Attribute Mixture-of-Experts
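As a rough illustration of the gating mechanism described in the C2R entry above, here is a generic top-k MoE forward pass in NumPy. The expert count, dimensions, and k are made up, and this is not the paper's collaboration-constrained router.

```python
import numpy as np

def topk_moe_layer(x, gate_w, experts, k=2):
    """Generic top-k MoE forward pass: route each token to k experts.

    x       : (tokens, d) token embeddings
    gate_w  : (d, n_experts) gating network weights
    experts : list of callables, each mapping a (d,) vector to a (d,) vector
    """
    logits = x @ gate_w                                   # (tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)            # softmax gate scores
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-k:]                   # indices of the k best experts
        w = probs[t, top] / probs[t, top].sum()           # renormalize selected scores
        out[t] = sum(wi * experts[e](x[t]) for wi, e in zip(w, top))
    return out

# toy usage: 4 experts, each a fixed random linear map
rng = np.random.default_rng(0)
d, n_exp = 8, 4
experts = [lambda v, W=rng.normal(size=(d, d)) / d: W @ v for _ in range(n_exp)]
tokens = rng.normal(size=(5, d))
print(topk_moe_layer(tokens, rng.normal(size=(d, n_exp)), experts).shape)  # (5, 8)
```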

$\textit{Agents Under Siege}$: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks

no code implementations 31 Mar 2025 Rana Muhammad Shahroz Khan, Zhen Tan, Sukwon Yun, Charles Flemming, Tianlong Chen

Most discussions about Large Language Model (LLM) safety have focused on single-agent settings, but multi-agent LLM systems now create novel adversarial risks because their behavior depends on communication between agents and decentralized reasoning.

Adversarial Attack Large Language Model

ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion

no code implementations 31 Mar 2025 Rana Muhammad Shahroz Khan, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen

Parameter generation has emerged as a novel paradigm for neural network development, offering an alternative to traditional neural network training by synthesizing high-quality model weights directly.

In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents

no code implementations 11 Mar 2025 Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister

Large Language Models (LLMs) have made significant progress in open-ended dialogue, yet their inability to retain and retrieve relevant information from long-term interactions limits their effectiveness in applications requiring sustained personalization.

Management Reinforcement Learning (RL) +1

You Only Debias Once: Towards Flexible Accuracy-Fairness Trade-offs at Inference Time

1 code implementation 10 Mar 2025 Xiaotian Han, Tianlong Chen, Kaixiong Zhou, Zhimeng Jiang, Zhangyang Wang, Xia Hu

Instead of pursuing one individual fixed point (fairness-optimum) in the weight space, we aim to find a "line" in the weight space that connects the accuracy-optimum and fairness-optimum points using a single model.

Fairness
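The "line in weight space" idea in the entry above amounts to linear interpolation between two sets of trained weights. A minimal sketch with made-up parameter names follows; it illustrates the interpolation only, not the paper's training procedure.

```python
import numpy as np

def interpolate_weights(w_acc, w_fair, t):
    """Point on the straight line between two weight configurations.

    w_acc, w_fair : dicts of parameter name -> ndarray (same shapes)
    t             : 0.0 -> accuracy-optimum end, 1.0 -> fairness-optimum end
    """
    return {name: (1.0 - t) * w_acc[name] + t * w_fair[name] for name in w_acc}

# toy usage: sweep the accuracy-fairness knob at inference time
w_acc = {"layer.weight": np.ones((2, 2)), "layer.bias": np.zeros(2)}
w_fair = {"layer.weight": np.zeros((2, 2)), "layer.bias": np.ones(2)}
for t in (0.0, 0.5, 1.0):
    print(t, interpolate_weights(w_acc, w_fair, t)["layer.weight"][0, 0])
```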

Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning

no code implementations 7 Mar 2025 Justin Chih-Yao Chen, Sukwon Yun, Elias Stengel-Eskin, Tianlong Chen, Mohit Bansal

We propose a skill-based recruiting strategy that dynamically selects the most relevant set of expert LLMs for diverse reasoning tasks based on their strengths.

Math Mixture-of-Experts +1
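A toy version of the skill-based recruiting idea in the Symbolic-MoE entry above: score each candidate expert LLM by overlap with the task's inferred skills and recruit the top-k. The skill sets and model names are hypothetical, and the paper's actual skill inference and aggregation are not reproduced here.

```python
def recruit_experts(task_skills, expert_profiles, k=3):
    """Rank expert models by overlap between the task's inferred skills and
    each expert's skill profile, then recruit the top-k (toy scoring)."""
    scores = {name: len(task_skills & skills) for name, skills in expert_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

experts = {
    "math-llm": {"algebra", "arithmetic", "proof"},
    "code-llm": {"python", "debugging"},
    "bio-llm": {"genetics", "proof"},
}
print(recruit_experts({"algebra", "proof"}, experts, k=2))  # ['math-llm', 'bio-llm']
```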

GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models

no code implementations 3 Mar 2025 Mufan Qiu, Xinyu Hu, Fengwei Zhan, Sukwon Yun, Jie Peng, Ruichen Zhang, Bhavya Kailkhura, Jiekun Yang, Tianlong Chen

Second, we design a structure-aware integration framework that addresses the information asymmetry in GRNs through two technical advances: (1) a graph topological adapter using multi-head cross-attention to weight regulatory relationships dynamically, and (2) a novel edge perturbation strategy that perturbs GRNs with biologically-informed co-expression links to augment graph neural network training.

Drug Response Prediction Graph Neural Network

NeuroSymAD: A Neuro-Symbolic Framework for Interpretable Alzheimer's Disease Diagnosis

no code implementations 1 Mar 2025 Yexiao He, Ziyao Wang, Yuning Zhang, Tingting Dan, Tianlong Chen, Guorong Wu, Ang Li

A neural network perceives brain MRI scans, while a large language model (LLM) distills medical rules to guide a symbolic system in reasoning over biomarkers and medical history.

Diagnostic Language Modeling +2

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam

1 code implementation 24 Feb 2025 Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xiang Li, Li Shen, Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu

This paper comprehensively evaluates several recently proposed optimizers for 4-bit training, revealing that low-bit precision amplifies sensitivity to learning rates and often causes unstable gradient norms, leading to divergence at higher learning rates.

Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

no code implementations 24 Feb 2025 Martin Kuo, Jingyang Zhang, Jianyi Zhang, Minxue Tang, Louis DiValentin, Aolin Ding, Jingwei Sun, William Chen, Amin Hass, Tianlong Chen, Yiran Chen, Hai Li

With the rise of large language models (LLMs), increasing research has recognized their risk of leaking personally identifiable information (PII) under malicious attacks.

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

no code implementations 20 Feb 2025 Shrey Pandit, Jiawei Xu, Junyuan Hong, Zhangyang Wang, Tianlong Chen, Kaidi Xu, Ying Ding

Our experiments show that state-of-the-art LLMs, including GPT-4o, Llama-3.1, and the medically fine-tuned UltraMedical, struggle with this binary hallucination detection task, with the best model achieving an F1 score as low as 0.625 for detecting "hard" category hallucinations.

Decision Making Hallucination +1

Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs

no code implementations 11 Feb 2025 Ruichen Zhang, Mufan Qiu, Zhen Tan, Mohan Zhang, Vincent Lu, Jie Peng, Kaidi Xu, Leandro Z. Agudelo, Peter Qian, Tianlong Chen

In this paper, we propose AgentSymbiotic, an iterative framework that couples data synthesis with task-performance, yielding a "symbiotic improvement" for both large and small LLMs.

Multi-Task Learning

Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations

no code implementations 29 Jan 2025 Zijie Liu, Xinyu Zhao, Jie Peng, Zhuangdi Zhu, Qingyu Chen, Xia Hu, Tianlong Chen

Current medical AI systems often fail to replicate real-world clinical reasoning, as they are predominantly trained and evaluated on static text and question-answer tasks.

Diagnostic

Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense

1 code implementation 5 Jan 2025 Yang Ouyang, Hengrui Gu, Shuhang Lin, Wenyue Hua, Jie Peng, Bhavya Kailkhura, Meijun Gao, Tianlong Chen, Kaixiong Zhou

As large language models (LLMs) are increasingly deployed in diverse applications, including chatbot assistants and code generation, aligning their behavior with safety and ethical standards has become paramount.

Chatbot Code Generation

GroverGPT: A Large Language Model with 8 Billion Parameters for Quantum Searching

no code implementations 30 Dec 2024 Haoran Wang, Pingzhi Li, Min Chen, Jinglei Cheng, Junyu Liu, Tianlong Chen

In this work, we explore the potential of leveraging Large Language Models (LLMs) to simulate the output of a quantum Turing machine using Grover's quantum circuits, known to provide quadratic speedups over classical counterparts.

Language Modeling Language Modelling +1
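For reference on the quadratic speedup mentioned in the GroverGPT entry above, the textbook analysis of Grover search needs roughly (pi/4)*sqrt(N) iterations. The small script below uses the standard success-probability formula; it illustrates the algorithm being simulated, not the paper's LLM-based simulation.

```python
import math

def grover_success_prob(n_items, n_iterations):
    """Textbook success probability of Grover search with one marked item."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return math.sin((2 * n_iterations + 1) * theta) ** 2

N = 2 ** 10                                      # database of 1024 items
k_opt = math.floor(math.pi / 4 * math.sqrt(N))   # ~25 iterations vs ~512 expected classical probes
print(k_opt, round(grover_success_prob(N, k_opt), 4))
```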

BrainMAP: Learning Multiple Activation Pathways in Brain Networks

2 code implementations 23 Dec 2024 Song Wang, Zhenyu Lei, Zhen Tan, Jiaqi Ding, Xinyu Zhao, Yushun Dong, Guorong Wu, Tianlong Chen, Chen Chen, Aiying Zhang, Jundong Li

As such, conventional GNNs struggle to learn from these pathways due to the long-range dependencies of multiple pathways.

Mamba Mixture-of-Experts

Accessing the topological properties of human brain functional sub-circuits in Echo State Networks

no code implementations 19 Dec 2024 Bach Nguyen, Tianlong Chen, Shu Yang, BoJian Hou, Li Shen, Duy Duong-Tran

Connectomics-based neuromorphic computing has primarily focused on embedding human brain large-scale structural connectomes (SCs), as estimated from the diffusion Magnetic Resonance Imaging (dMRI) modality, into echo-state networks (ESNs).

Memorization

DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis

1 code implementation 16 Dec 2024 Pan Wang, Qiang Zhou, Yawen Wu, Tianlong Chen, Jingtong Hu

To address these issues, we propose a Disentangled-Language-Focused (DLF) multimodal representation learning framework, which incorporates a feature disentanglement module to separate modality-shared and modality-specific information.

Disentanglement Multimodal Sentiment Analysis

Integrating Social Determinants of Health into Knowledge Graphs: Evaluating Prediction Bias and Fairness in Healthcare

1 code implementation 29 Nov 2024 Tianqi Shang, Weiqing He, Tianlong Chen, Ying Ding, Huanmei Wu, Kaixiong Zhou, Li Shen

This approach represents one of the first comprehensive investigations into fairness issues within biomedical knowledge graphs incorporating SDoH.

Fairness Knowledge Graphs +1

Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints

1 code implementation 26 Nov 2024 Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Xiaoye Qu, Tianlong Chen, Yu Cheng

Diffusion Transformers (DiT) have emerged as a powerful architecture for image and video generation, offering superior quality and scalability.

Denoising Image Generation +1

Path-RAG: Knowledge-Guided Key Region Retrieval for Open-ended Pathology Visual Question Answering

1 code implementation 26 Nov 2024 Awais Naeem, TianHao Li, Huang-Ru Liao, Jiawei Xu, Aby M. Mathew, Zehao Zhu, Zhen Tan, Ajay Kumar Jaiswal, Raffi A. Salibian, Ziniu Hu, Tianlong Chen, Ying Ding

Acknowledging the complexity of pathology image analysis, Path-RAG adopts a human-centered AI approach by retrieving domain knowledge using HistoCartography to select the relevant patches from pathology images.

Prognosis Question Answering +3

BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment

no code implementations 16 Nov 2024 Sizhe Wang, Yongqi Tong, Hengyuan Zhang, Dawei Li, Xin Zhang, Tianlong Chen

Building on this, we further propose Balanced Preference Optimization (BPO), designed to dynamically augment the knowledge depth of each sample.

Informativeness

FairSkin: Fair Diffusion for Skin Disease Image Generation

no code implementations 29 Oct 2024 Ruichen Zhang, Yuguang Yao, Zhen Tan, Zhiming Li, Pan Wang, Huan Liu, Jingtong Hu, Sijia Liu, Tianlong Chen

Diffusion Model (DM) has become a leading method in generating synthetic medical images, but it suffers from a critical twofold bias: (1) The quality of images generated for Caucasian individuals is significantly higher, as measured by the Fréchet Inception Distance (FID).

Data Augmentation Diagnostic +2

GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning

1 code implementation 17 Oct 2024 Guibin Zhang, Haonan Dong, Yuchen Zhang, ZHIXUN LI, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang

Training high-quality deep models necessitates vast amounts of data, resulting in overwhelming computational and memory demands.

Graph Embedding

Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching

no code implementations 17 Oct 2024 Jie Peng, Zhang Cao, Huaizhi Qu, Zhengyu Zhang, Chang Guo, Yanyong Zhang, Zhichao Cao, Tianlong Chen

To enhance communication efficiency, M2Cache maintains a neuron-level mixed-precision LRU cache in HBM, a larger layer-aware cache in DRAM, and a full model in SSD.

Quantization
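One building block implied by the multi-level caching described in the M2Cache entry above is an LRU cache per tier. A minimal, generic sketch follows; the tier names and miss handler are hypothetical, and the system's neuron-level mixed-precision policy is not modeled.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache standing in for one tier (e.g. the HBM tier); on a
    miss, the handler would fetch the entry from the slower tier."""

    def __init__(self, capacity, on_miss):
        self.capacity, self.on_miss, self.store = capacity, on_miss, OrderedDict()

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)          # mark as most recently used
            return self.store[key]
        value = self.on_miss(key)                # fetch from the DRAM/SSD tier
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)       # evict the least recently used entry
        return value

hbm = LRUCache(capacity=2, on_miss=lambda k: f"weights[{k}] fetched from DRAM")
print(hbm.get("layer0"), hbm.get("layer1"), hbm.get("layer0"), hbm.get("layer2"))
```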

G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks

no code implementations 15 Oct 2024 Guibin Zhang, Yanwei Yue, Xiangguo Sun, Guancheng Wan, Miao Yu, Junfeng Fang, Kun Wang, Tianlong Chen, Dawei Cheng

Recent advancements in large language model (LLM)-based agents have demonstrated that collective intelligence can significantly surpass the capabilities of individual agents, primarily due to well-crafted inter-agent communication topologies.

HumanEval Language Modelling +2

Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection

1 code implementation 14 Oct 2024 Adyasha Maharana, Jaehong Yoon, Tianlong Chen, Mohit Bansal

This data selector samples a subset of the most important samples from each skill cluster for training.

Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts

1 code implementation 10 Oct 2024 Sukwon Yun, Inyoung Choi, Jie Peng, Yangfan Wu, Jingxuan Bao, Qiyiwen Zhang, Jiayi Xin, Qi Long, Tianlong Chen

The core idea of Flex-MoE is to first address missing modalities using a new missing modality bank that integrates observed modality combinations with the corresponding missing ones.

Mixture-of-Experts

Glider: Global and Local Instruction-Driven Expert Router

1 code implementation 9 Oct 2024 Pingzhi Li, Prateek Yadav, Jaehong Yoon, Jie Peng, Yi-Lin Sung, Mohit Bansal, Tianlong Chen

Our experiments using T5-based models for T0 and FLAN tasks demonstrate that GLIDER achieves substantially improved held-in performance while maintaining strong generalization on held-out tasks.

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning

1 code implementation 9 Oct 2024 Abhinav Bandari, Lu Yin, Cheng-Yu Hsieh, Ajay Kumar Jaiswal, Tianlong Chen, Li Shen, Ranjay Krishna, Shiwei Liu

In this study, we evaluate the choice of calibration data on LLM pruning, across a wide range of datasets that are most commonly used in LLM training and evaluation, including four pretraining datasets as well as three categories of downstream tasks encompassing nine datasets.

In-Context Learning Network Pruning

PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches

no code implementations 8 Oct 2024 Rana Muhammad Shahroz Khan, Pingzhi Li, Sukwon Yun, Zhenyu Wang, Shahriar Nirjon, Chau-Wai Wong, Tianlong Chen

As large language models (LLMs) increasingly shape the AI landscape, fine-tuning pretrained models has become more popular than in the pre-LLM era for achieving optimal performance in domain-specific tasks.

GSM8K parameter-efficient fine-tuning +2

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild

1 code implementation 7 Oct 2024 Xinyu Zhao, Guoheng Sun, Ruisi Cai, Yukun Zhou, Pingzhi Li, Peihao Wang, Bowen Tan, Yexiao He, Li Chen, Yi Liang, Beidi Chen, Binhang Yuan, Hongyi Wang, Ang Li, Zhangyang Wang, Tianlong Chen

As Large Language Models (LLMs) excel across tasks and specialized domains, scaling LLMs based on existing models has garnered significant attention, which faces the challenge of decreasing performance when combining disparate models.

Benchmarking Mixture-of-Experts +1

Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs

1 code implementation 4 Oct 2024 Tianqi Shang, Shu Yang, Weiqing He, Tianhua Zhai, Dawei Li, BoJian Hou, Tianlong Chen, Jason H. Moore, Marylyn D. Ritchie, Li Shen

Growing evidence suggests that social determinants of health (SDoH), a set of nonmedical factors, affect individuals' risks of developing Alzheimer's disease (AD) and related dementias.

Knowledge Graphs Language Modeling +3

Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems

no code implementations 3 Oct 2024 Guibin Zhang, Yanwei Yue, ZHIXUN LI, Sukwon Yun, Guancheng Wan, Kun Wang, Dawei Cheng, Jeffrey Xu Yu, Tianlong Chen

Recent advancements in large language model (LLM)-powered agents have shown that collective intelligence can significantly outperform individual capabilities, largely attributed to the meticulously designed inter-agent communication topologies.

Language Modelling Large Language Model +1

Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models

1 code implementation 2 Oct 2024 Joseph Lee, Shu Yang, Jae Young Baik, Xiaoxi Liu, Zhen Tan, Dawei Li, Zixuan Wen, BoJian Hou, Duy Duong-Tran, Tianlong Chen, Li Shen

Predicting phenotypes with complex genetic bases based on a small, interpretable set of variant features remains a challenging task.

feature selection

Enhancing Quantum Security over Federated Learning via Post-Quantum Cryptography

no code implementations 6 Sep 2024 Pingzhi Li, Tianlong Chen, Junyu Liu

Federated learning (FL) has become one of the standard approaches for deploying machine learning models on edge devices, where private training data are distributed across clients, and a shared model is learned by aggregating locally computed updates from each client.

Federated Learning

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

no code implementations 13 Aug 2024 Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, Alessandro Sordoni

The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task.

Mixture-of-Experts Survey

Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network

1 code implementation 25 Jul 2024 Sukwon Yun, Jie Peng, Alexandro E. Trevino, Chanyoung Park, Tianlong Chen

Recent advancements in graph-based approaches for multiplexed immunofluorescence (mIF) images have significantly propelled the field forward, offering deeper insights into patient-level phenotyping.

Graph Neural Network image-classification +1

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

no code implementations 24 Jul 2024 Tianjin Huang, Fang Meng, Li Shen, Fan Liu, Yulong Pei, Mykola Pechenizkiy, Shiwei Liu, Tianlong Chen

In this paper, we investigate a charming possibility: leveraging visual prompts to capture the channel importance and derive high-quality structural sparsity.

SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH)

no code implementations 24 Jul 2024 Bernardo Consoli, Xizhi Wu, Song Wang, Xinyu Zhao, Yanshan Wang, Justin Rousseau, Tom Hartvigsen, Li Shen, Huanmei Wu, Yifan Peng, Qi Long, Tianlong Chen, Ying Ding

Extracting social determinants of health (SDoH) from unstructured medical notes depends heavily on labor-intensive annotations, which are typically task-specific, hampering reusability and limiting sharing.

Computational Efficiency Language Modeling +2

Cross-Lingual Multi-Hop Knowledge Editing -- Benchmarks, Analysis and a Simple Contrastive Learning based Approach

no code implementations 14 Jul 2024 Aditi Khandelwal, Harman Singh, Hengrui Gu, Tianlong Chen, Kaixiong Zhou

We propose the Cross-Lingual Multi-Hop Knowledge Editing paradigm, for measuring and analyzing the performance of various SoTA knowledge editing techniques in a cross-lingual setup.

Contrastive Learning knowledge editing

Composable Interventions for Language Models

1 code implementation 9 Jul 2024 Arinbjorn Kolbeinsson, Kyle O'Brien, Tianjin Huang, ShangHua Gao, Shiwei Liu, Jonathan Richard Schwarz, Anurag Vaidya, Faisal Mahmood, Marinka Zitnik, Tianlong Chen, Thomas Hartvigsen

Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining.

knowledge editing Machine Unlearning +1

Benchmark on Drug Target Interaction Modeling from a Structure Perspective

1 code implementation 4 Jul 2024 Xinnan Zhang, Jialin Wu, Junyi Xie, Tianlong Chen, Kaixiong Zhou

In view of these, we conduct a comprehensive survey and benchmark for drug-target interaction modeling from a structure perspective, via integrating tens of explicit (i.e., GNN-based) and implicit (i.e., Transformer-based) structure learning algorithms.

Benchmarking Drug Discovery

"Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

no code implementations 26 Jun 2024 Zhen Tan, Chengshuai Zhao, Raha Moraffah, YiFan Li, Song Wang, Jundong Li, Tianlong Chen, Huan Liu

Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases, improving their performance in applications like fact-checking and information searching.

Fact Checking RAG +1

$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts

1 code implementation 17 Jun 2024 Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng

Motivated by the research gap and counter-intuitive phenomenon, we propose $\texttt{MoE-RBench}$, the first comprehensive assessment of SMoE reliability from three aspects: $\textit{(i)}$ safety and hallucination, $\textit{(ii)}$ resilience to adversarial attacks, and $\textit{(iii)}$ out-of-distribution robustness.

Hallucination Mixture-of-Experts

Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark

1 code implementation 12 Jun 2024 Pingzhi Li, Xiaolong Jin, Yu Cheng, Tianlong Chen

Large Language Models (LLMs) have become foundational in the realm of natural language processing, demonstrating performance improvements as model sizes increase.

Benchmarking Mixture-of-Experts +2

Graph Sparsification via Mixture of Graphs

1 code implementation 23 May 2024 Guibin Zhang, Xiangguo Sun, Yanwei Yue, Chonghe Jiang, Kun Wang, Tianlong Chen, Shirui Pan

Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.

Graph Learning Mixture-of-Experts

DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature

1 code implementation 8 May 2024 Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, BoJian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen

With a synergized framework of LLM and KG mutually enhancing each other, we first leverage LLM to construct an evolving AD-specific knowledge graph (KG) sourced from AD-related scientific literature, and then we utilize a coarse-to-fine sampling method with a novel self-aware knowledge retrieval approach to select appropriate knowledge from the KG to augment LLM inference capabilities.

Question Answering

Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent

1 code implementation 30 Apr 2024 Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen

Nevertheless, one of its major bottlenecks is matrix inversion, which takes $O(N^3)$ time and scales poorly.

Scheduling
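The $O(N^3)$ bottleneck noted in the Q-Newton entry above refers to solving the Hessian system in Newton's update. A minimal classical sketch follows, with toy dimensions and a symmetric positive-definite stand-in Hessian; it illustrates the update being accelerated, not the paper's hybrid scheduler.

```python
import numpy as np

def newton_step(grad, hessian, theta):
    """One Newton step: theta <- theta - H^{-1} g.

    Solving the N x N linear system below is the O(N^3) bottleneck that
    hybrid quantum-classical scheduling aims to accelerate.
    """
    return theta - np.linalg.solve(hessian, grad)

rng = np.random.default_rng(0)
N = 4
A = rng.normal(size=(N, N))
H = A @ A.T + N * np.eye(N)          # symmetric positive-definite stand-in Hessian
g = rng.normal(size=N)
print(newton_step(g, H, np.zeros(N)))
```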

Facial Affective Behavior Analysis with Instruction Tuning

1 code implementation 7 Apr 2024 YiFan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong

Our initiative on the dataset and benchmarks reveals the nature and rationale of facial affective behaviors, i.e., fine-grained facial movement, interpretability, and reasoning.

Emotion Recognition Instruction Following

FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping

no code implementations 5 Apr 2024 Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella

In this work, we observed the saturation of computationally expensive feed-forward blocks of LLM layers and proposed FFN-SkipLLM, a novel fine-grained skip strategy for autoregressive LLMs.

Attribute Hallucination +1

Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration

2 code implementations 3 Apr 2024 Shwai He, Ang Li, Tianlong Chen

This study addresses two key questions: how to distribute sparsity across different modality-specific models, and how to restore the performance of pruned sparse VLMs.

Knowledge Distillation

Thought Graph: Generating Thought Process for Biological Reasoning

no code implementations 11 Mar 2024 Chi-Yang Hsu, Kyle Cox, Jiawei Xu, Zhen Tan, Tianhua Zhai, Mengzhou Hu, Dexter Pratt, Tianlong Chen, Ziniu Hu, Ying Ding

We present the Thought Graph as a novel framework to support complex reasoning and use gene set analysis as an example to uncover semantic relationships between biological processes.

Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach

no code implementations 8 Mar 2024 Zhen Tan, Jie Peng, Tianlong Chen, Huan Liu

Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks through few-shot or zero-shot prompting, bypassing the need for parameter tuning.

Decision Making Hallucination

Privacy-preserving Fine-tuning of Large Language Models through Flatness

no code implementations 7 Mar 2024 Tiejin Chen, Longchao Da, Huixue Zhou, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei

The privacy concerns associated with the use of Large Language Models (LLMs) have grown recently with the development of LLMs such as ChatGPT.

Knowledge Distillation Privacy Preserving +3

GraphRCG: Self-Conditioned Graph Generation

no code implementations 2 Mar 2024 Song Wang, Zhen Tan, Xinyu Zhao, Tianlong Chen, Huan Liu, Jundong Li

In contrast, in this work, we propose a novel self-conditioned graph generation framework designed to explicitly model graph distributions and employ these distributions to guide the generation process.

Graph Generation

Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

no code implementations 22 Feb 2024 Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Yue Zhang, Ren Wang, Xiaoshuang Shi, Kaidi Xu

Uncertainty estimation is crucial for the reliability of safety-critical human and artificial intelligence (AI) interaction systems, particularly in the domain of healthcare engineering.

Form MedQA +2

Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization

1 code implementation 22 Feb 2024 Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang

Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.
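The "moderately high loss" selection described above can be illustrated by keeping samples whose loss lies between two quantiles. The thresholds below are made up, and this is only a sketch of the general idea, not the paper's exact reweighting scheme.

```python
import numpy as np

def select_mid_hard(losses, low_q=0.5, high_q=0.9):
    """Indices of samples with moderately high loss: above the median but
    below the extreme tail (quantile thresholds are illustrative)."""
    lo, hi = np.quantile(losses, [low_q, high_q])
    return np.where((losses >= lo) & (losses <= hi))[0]

losses = np.random.default_rng(0).exponential(size=1000)
print(len(select_mid_hard(losses)))  # roughly 400 retained samples
```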

The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative

1 code implementation 20 Feb 2024 Zhen Tan, Chengshuai Zhao, Raha Moraffah, YiFan Li, Yu Kong, Tianlong Chen, Huan Liu

Unlike direct harmful output generation for MLLMs, our research demonstrates how a single MLLM agent can be subtly influenced to generate prompts that, in turn, induce other MLLM agents in the society to output malicious content.

Misinformation

Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark

1 code implementation 18 Feb 2024 Yihua Zhang, Pingzhi Li, Junyuan Hong, Jiaxiang Li, Yimeng Zhang, Wenqing Zheng, Pin-Yu Chen, Jason D. Lee, Wotao Yin, Mingyi Hong, Zhangyang Wang, Sijia Liu, Tianlong Chen

In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has become standard.

Benchmarking
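The zeroth-order (ZO) methods benchmarked in the entry above replace backpropagated gradients with forward-only estimates. A generic two-point ZO gradient estimator is sketched below; the perturbation scale and sample count are illustrative, and the benchmark's specific ZO variants are not reproduced.

```python
import numpy as np

def zo_gradient(loss_fn, theta, mu=1e-3, n_samples=8, rng=None):
    """Two-point zeroth-order gradient estimate using only forward passes."""
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        u = rng.normal(size=theta.shape)                       # random direction
        g += (loss_fn(theta + mu * u) - loss_fn(theta - mu * u)) / (2 * mu) * u
    return g / n_samples

# toy quadratic loss: the true gradient at theta is 2 * theta
theta = np.array([1.0, -2.0, 0.5])
print(zo_gradient(lambda t: float(np.sum(t * t)), theta))
```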

Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness

no code implementations 2 Feb 2024 Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen

Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor.

Adversarial Defense Graph Learning

Contextualization Distillation from Large Language Model for Knowledge Graph Completion

1 code implementation 28 Jan 2024 Dawei Li, Zhen Tan, Tianlong Chen, Huan Liu

While textual information significantly enhances the performance of pre-trained language models (PLMs) in knowledge graph completion (KGC), the static and noisy nature of existing corpora collected from Wikipedia articles or synsets definitions often limits the potential of PLM-based KGC models.

Language Modeling Language Modelling +1

QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits

1 code implementation 10 Jan 2024 Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang

To address these two pain points, we propose QuantumSEA, an in-time sparse exploration for noise-adaptive quantum circuits, aiming to achieve two key objectives: (1) implicit circuits capacity during training - by dynamically exploring the circuit's sparse connectivity and sticking a fixed small number of quantum gates throughout the training which satisfies the coherence time and enjoy light noises, enabling feasible executions on real quantum devices; (2) noise robustness - by jointly optimizing the topology and parameters of quantum circuits under real device noise models.

Quantum Machine Learning

Molecular Data Programming: Towards Molecule Pseudo-labeling with Systematic Weak Supervision

no code implementations CVPR 2024 Xin Juan, Kaixiong Zhou, Ninghao Liu, Tianlong Chen, Xin Wang

The great advancement of molecular machine learning is premised on a considerable amount of labeled data.

Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention

2 code implementations 22 Dec 2023 Zhen Tan, Tianlong Chen, Zhenyu Zhang, Huan Liu

Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.

The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need

no code implementations 9 Dec 2023 Tianjin Huang, Tianlong Chen, Zhangyang Wang, Shiwei Liu

Therefore, it remains unclear whether the self-attention operation is crucial for the recent advances in SSL, or whether CNNs with more advanced designs can deliver the same excellence.

All Self-Supervised Learning

Rethinking PGD Attack: Is Sign Function Necessary?

1 code implementation 3 Dec 2023 Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang

Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

1 code implementation 3 Dec 2023 Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen

The rapid development of large-scale deep learning models questions the affordability of hardware platforms, which necessitates the pruning to reduce their computational and memory footprints.

Image Classification Visual Prompting

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

1 code implementation CVPR 2024 Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu

Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization.

Denoising Image Generation +1
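As background for the 4-bit weight quantization mentioned in the TFMQ-DM entry above, a generic uniform round-to-nearest fake-quantizer is sketched below. It is a plain baseline for illustration, not the paper's temporal-feature-maintenance method.

```python
import numpy as np

def quantize_weights(w, n_bits=4):
    """Uniform symmetric fake-quantization of a weight tensor."""
    q_max = 2 ** (n_bits - 1) - 1                        # 7 for signed 4-bit
    scale = np.abs(w).max() / q_max
    q = np.clip(np.round(w / scale), -q_max - 1, q_max)  # integer grid values
    return q * scale                                     # dequantized weights

w = np.random.default_rng(0).normal(size=1000)
print(round(float(np.abs(w - quantize_weights(w)).mean()), 4))  # mean rounding error
```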

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy

1 code implementation 2 Oct 2023 Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen

Sparsely activated Mixture-of-Experts (SMoE) has shown promise to scale up the learning capacity of neural networks; however, it has issues like (a) High Memory Usage, due to duplication of the network layers into multiple copies as experts, and (b) Redundancy in Experts, as common learning-based routing policies suffer from representational collapse.

Mixture-of-Experts

Robust Mixture-of-Expert Training for Convolutional Neural Networks

1 code implementation ICCV 2023 Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, huan zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu

Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How to adversarially robustify a CNN-based MoE model?

Adversarial Robustness

Enhancing Adversarial Training via Reweighting Optimization Trajectory

1 code implementation 25 Jun 2023 Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlado Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.

Adversarial Robustness

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

2 code implementations 24 Jun 2023 Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
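A toy sketch of the general shape of such a KV-cache eviction policy: keep the most recent positions plus the positions with the largest accumulated attention mass. The budget split and scoring below are illustrative, not H$_2$O's exact algorithm.

```python
import numpy as np

def evict_kv_cache(attn_mass, budget, recent_frac=0.5):
    """Choose which cached token positions to keep (toy heavy-hitter policy).

    attn_mass : (seq_len,) accumulated attention each past token has received
    budget    : total number of KV entries we can afford to keep
    """
    seq_len = attn_mass.shape[0]
    if seq_len <= budget:
        return np.arange(seq_len)
    n_recent = int(budget * recent_frac)
    n_heavy = budget - n_recent
    recent = np.arange(seq_len - n_recent, seq_len)              # always keep the tail
    older = np.arange(seq_len - n_recent)
    heavy = older[np.argsort(attn_mass[older])][::-1][:n_heavy]  # largest accumulated mass
    return np.sort(np.concatenate([heavy, recent]))

scores = np.random.default_rng(1).random(32)   # stand-in for accumulated attention
print(evict_kv_cache(scores, budget=8))        # 8 kept positions out of 32
```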

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

1 code implementation 18 Jun 2023 Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

By dividing giant graph data, we build multiple independently and parallelly trained weaker GNNs (soup ingredient) without any intermediate communication, and combine their strength using a greedy interpolation soup procedure to achieve state-of-the-art performance.

graph partitioning Graph Sampling
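The greedy interpolation soup mentioned in the Graph Ladling entry above follows the general "model soup" recipe: keep a running weight average and add each candidate only if held-out accuracy does not drop. A minimal generic sketch with toy models and a toy accuracy function:

```python
import numpy as np

def greedy_soup(candidates, evaluate):
    """Greedy interpolation soup over independently trained models.

    candidates : list of weight dicts (name -> ndarray), pre-sorted by solo accuracy
    evaluate   : callable mapping a weight dict to validation accuracy
    """
    soup, n = dict(candidates[0]), 1
    best = evaluate(soup)
    for cand in candidates[1:]:
        trial = {k: (soup[k] * n + cand[k]) / (n + 1) for k in soup}  # running average
        score = evaluate(trial)
        if score >= best:                        # keep the ingredient only if it helps
            soup, n, best = trial, n + 1, score
    return soup

# toy usage: "accuracy" rewards parameters close to 0.5
rng = np.random.default_rng(0)
models = [{"w": rng.normal(0.5, 0.2, size=3)} for _ in range(5)]
accuracy = lambda m: -float(np.abs(m["w"] - 0.5).mean())
print(greedy_soup(models, accuracy)["w"])
```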

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

1 code implementation 18 Jun 2023 Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

Motivated by the recent observations of model soups, which suggest that fine-tuned weights of multiple models can be merged to a better minima, we propose Instant Soup Pruning (ISP) to generate lottery ticket quality subnetworks, using a fraction of the original IMP cost by replacing the expensive intermediate pruning stages of IMP with computationally efficient weak mask generation and aggregation routine.

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

1 code implementation NeurIPS 2023 Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang

Large pre-trained transformers are the show-stealers of modern-day deep learning, and it becomes crucial to comprehend the parsimonious patterns that exist within them as they grow in scale.

Self-Supervised Learning

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!

1 code implementation 3 Mar 2023 Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang

In pursuit of a more general evaluation and unveiling the true potential of sparse algorithms, we introduce the "Sparsity May Cry" Benchmark (SMC-Bench), a collection of 4 carefully curated diverse tasks with 10 datasets, which captures a wide range of domain-specific and sophisticated knowledge.

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

1 code implementation 2 Mar 2023 Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang

Despite their remarkable achievement, gigantic transformers encounter significant drawbacks, including exorbitant computational and memory footprints during training, as well as severe collapse evidenced by a high degree of parameter redundancy.

Mixture-of-Experts

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

1 code implementation 28 Feb 2023 Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang

This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same "task distribution".

Learning to Generalize Provably in Learning to Optimize

1 code implementation 22 Feb 2023 Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.

AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts

1 code implementation ICCV 2023 Tianlong Chen, Xuxi Chen, Xianzhi Du, Abdullah Rashwan, Fan Yang, Huizhong Chen, Zhangyang Wang, Yeqing Li

Instead of compressing multiple tasks' knowledge into a single model, MoE separates the parameter space and only utilizes the relevant model pieces given task type and its input, which provides stabilized MTL training and ultra-efficient inference.

Instance Segmentation Mixture-of-Experts +4

You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

1 code implementation 28 Nov 2022 Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu

Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks).

All Out-of-Distribution Detection

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

1 code implementation 19 Nov 2022 Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.

QuanGCN: Noise-Adaptive Training for Robust Quantum Graph Convolutional Networks

no code implementations 9 Nov 2022 Kaixiong Zhou, Zhenyu Zhang, Shengyuan Chen, Tianlong Chen, Xiao Huang, Zhangyang Wang, Xia Hu

Quantum neural networks (QNNs), an interdisciplinary field of quantum computing and machine learning, have attracted tremendous research interest due to the specific quantum advantages.

M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design

1 code implementation 26 Oct 2022 Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang

However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at inference, current MTL regimes have to activate nearly the entire model even to just execute a single task.

Mixture-of-Experts Multi-Task Learning

Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again

1 code implementation 14 Oct 2022 Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin F. Rousseau, Ying Ding, Zhangyang Wang

In this paper, firstly, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability, as well as achieve state-of-the-art (SOTA) level performance from vanilla-GCNs.

Advancing Model Pruning via Bi-level Optimization

1 code implementation 8 Oct 2022 Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu

To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.

model

Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative

1 code implementation 7 Oct 2022 Tianxin Wei, Yuning You, Tianlong Chen, Yang shen, Jingrui He, Zhangyang Wang

This paper targets at improving the generalizability of hypergraph neural networks in the low-label regime, through applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL).

Contrastive Learning Fairness +2

Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?

2 code implementations 15 Sep 2022 Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zhangyang Wang

Vision Transformers (ViTs) have proven effective in solving 2D image understanding tasks by training over large-scale image datasets and, as a somewhat separate track, in modeling the 3D visual world, such as voxels or point clouds.

Point Cloud Segmentation

Is Attention All That NeRF Needs?

1 code implementation 27 Jul 2022 Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang

While prior works on NeRFs optimize a scene representation by inverting a handcrafted rendering equation, GNT achieves neural representation and rendering that generalizes across scenes using transformers at two stages.

All Generalizable Novel View Synthesis +3

Neural Implicit Dictionary via Mixture-of-Expert Training

1 code implementation 8 Jul 2022 Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang

In this paper, we present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID) from a data collection and representing INR as a functional combination of basis sampled from the dictionary.

Image Inpainting

Aug-NeRF: Training Stronger Neural Radiance Fields with Triple-Level Physically-Grounded Augmentations

1 code implementation CVPR 2022 Tianlong Chen, Peihao Wang, Zhiwen Fan, Zhangyang Wang

Inspired by that, we propose Augmented NeRF (Aug-NeRF), which for the first time brings the power of robust data augmentations into regularizing the NeRF training.

NeRF Novel View Synthesis +1

Training Your Sparse Neural Network Better with Any Mask

1 code implementation 26 Jun 2022 Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang

Pruning large neural networks to create high-quality, independently trainable sparse masks, which can maintain similar performance to their dense counterparts, is very desirable due to the reduced space and time complexity.

Can pruning improve certified robustness of neural networks?

1 code implementation 15 Jun 2022 Zhangheng Li, Tianlong Chen, Linyi Li, Bo Li, Zhangyang Wang

Given the fact that neural networks are often over-parameterized, one effective way to reduce such computational overhead is neural network pruning, by removing redundant parameters from trained neural networks.

Network Pruning

Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness

1 code implementation 15 Jun 2022 Tianlong Chen, huan zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang

Certifiable robustness is a highly desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios, but often demands tedious computations to establish.

Queried Unlabeled Data Improves and Robustifies Class-Incremental Learning

no code implementations 15 Jun 2022 Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

Inspired by the recent success of learning robust models with unlabeled data, we explore a new robustness-aware CIL setting, where the learned adversarial robustness has to resist forgetting and be transferred as new tasks come in continually.

Adversarial Robustness class-incremental learning +2

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

1 code implementation 9 Jun 2022 Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang

For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively.

Transfer Learning

Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free

1 code implementation CVPR 2022 Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, Zhangyang Wang

Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger.

Network Pruning

APP: Anytime Progressive Pruning

1 code implementation 4 Apr 2022 Diganta Misra, Bharat Runwal, Tianlong Chen, Zhangyang Wang, Irina Rish

With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings.

Network Pruning Sparse Learning

Symbolic Learning to Optimize: Towards Interpretability and Scalability

1 code implementation ICLR 2022 Wenqing Zheng, Tianlong Chen, Ting-Kuei Hu, Zhangyang Wang

Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks.

Symbolic Regression

Optimizer Amalgamation

1 code implementation ICLR 2022 Tianshu Huang, Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

Selecting an appropriate optimizer for a given problem is of major interest for researchers and practitioners.

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy

1 code implementation CVPR 2022 Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

However, a "head-to-toe assessment" regarding the extent of redundancy in ViTs, and how much we could gain by thoroughly mitigating such, has been absent for this field.

All Diversity

Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice

1 code implementation 9 Mar 2022 Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang

The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
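The decomposition described above can be sketched in a few lines: the low-pass (DC) part of a row-stochastic attention matrix is the uniform-averaging matrix, and the residual is the high-pass part that gets re-amplified. The scale below is a fixed stand-in for what would normally be a learned per-head parameter, so treat this as an illustration rather than the paper's implementation.

```python
import numpy as np

def attn_scale(attn, omega=2.0):
    """Re-amplify the high-pass component of a row-stochastic attention matrix."""
    n = attn.shape[-1]
    low_pass = np.full_like(attn, 1.0 / n)   # uniform-averaging (DC) component
    high_pass = attn - low_pass              # residual, each row sums to zero
    return low_pass + omega * high_pass      # rows still sum to one

attn = np.array([[0.7, 0.2, 0.1],
                 [0.3, 0.4, 0.3],
                 [0.1, 0.1, 0.8]])
print(attn_scale(attn))
```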

Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance

no code implementations 5 Mar 2022 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.

Model Compression

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training

1 code implementation ICLR 2022 Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang

We introduce two alternatives for sparse adversarial training: (i) static sparsity, by leveraging recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from the early training; (ii) dynamic sparsity, by allowing the sparse subnetwork to adaptively adjust its connectivity pattern (while sticking to the same sparsity ratio) throughout training.

Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

1 code implementation 9 Feb 2022 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

1 code implementation ICLR 2022 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks.

Adversarial Robustness Out-of-Distribution Detection
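Random pruning at initialization, as studied in the entry above, is simple enough to show directly: draw a random binary mask at the target sparsity and apply it to the freshly initialized weights. The layer shape and sparsity below are arbitrary.

```python
import numpy as np

def random_mask(shape, sparsity, rng):
    """Binary mask that zeroes out roughly `sparsity` of the weights at random."""
    return (rng.random(shape) >= sparsity).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)   # freshly initialized layer
mask = random_mask(w.shape, sparsity=0.9, rng=rng)   # remove ~90% of the weights
w_sparse = w * mask
print(round(1.0 - float(mask.mean()), 3))            # realized sparsity, ~0.9
```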

VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer

no code implementations 17 Jan 2022 Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang

To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.

High-Level Synthesis Quantization

Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations

1 code implementation 4 Jan 2022 Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen

Accordingly, we have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators, assuming that graph priors per se, similar to the concept of image manifolds, can be learned by data generation.

Contrastive Learning Graph Learning

CADTransformer: Panoptic Symbol Spotting Transformer for CAD Drawings

1 code implementation CVPR 2022 Zhiwen Fan, Tianlong Chen, Peihao Wang, Zhangyang Wang

CADTransformer tokenizes directly from the set of graphical primitives in CAD drawings, and correspondingly optimizes line-grained semantic and instance symbol spotting altogether by a pair of prediction heads.

Data Augmentation

Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling

1 code implementation NeurIPS 2021 Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Contrastive learning approaches have achieved great success in learning visual representations with few labels of the target classes.

Attribute Contrastive Learning +1

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership

1 code implementation NeurIPS 2021 Xuxi Chen, Tianlong Chen, Zhenyu Zhang, Zhangyang Wang

The lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork (i.e., winning ticket) instead of a full model for both training and inference, which can lower both costs without sacrificing performance.

DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

1 code implementation30 Oct 2021 Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng

To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.

parameter-efficient fine-tuning
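
To make the "sparsity prior in weight updates" above concrete, here is a rough sketch of a linear layer whose frozen pre-trained weight is fine-tuned only through a sparse additive delta. DSEE itself also exploits low-rank structure and does not pick the sparse positions randomly, so the class name, the random mask, and the density value are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseDeltaLinear(nn.Module):
    """Frozen pre-trained weight + sparse trainable update (sketch only; the real
    method combines this with low-rank updates and a learned, not random, mask)."""
    def __init__(self, base: nn.Linear, delta_density: float = 0.05):
        super().__init__()
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = None if base.bias is None else nn.Parameter(
            base.bias.detach().clone(), requires_grad=False)
        self.delta = nn.Parameter(torch.zeros_like(self.weight))   # the only trainable tensor
        self.register_buffer("mask", (torch.rand_like(self.weight) < delta_density).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.weight + self.delta * self.mask, self.bias)
```

Only a `delta_density` fraction of positions ever receives gradient updates, so optimizer state and communicated deltas shrink accordingly.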

CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks

no code implementations28 Oct 2021 Haotian Xue, Kaixiong Zhou, Tianlong Chen, Kai Guo, Xia Hu, Yi Chang, Xin Wang

In this paper, we investigate GNNs from the lens of weight and feature loss landscapes, i.e., the loss changes with respect to model weights and node features, respectively.

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

1 code implementation9 Oct 2021 Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang

This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language is seen as an individual task and is learned sequentially and continually.

Lifelong learning Speech Synthesis +3

Universality of Winning Tickets: A Renormalization Group Perspective

no code implementations7 Oct 2021 William T. Redman, Tianlong Chen, Zhangyang Wang, Akshunna S. Dogra

Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.

Inductive Lottery Ticket Learning for Graph Neural Networks

no code implementations29 Sep 2021 Yongduo Sui, Xiang Wang, Tianlong Chen, Xiangnan He, Tat-Seng Chua

In this work, we propose a simple and effective learning paradigm, Inductive Co-Pruning of GNNs (ICPG), to endow graph lottery tickets with inductive pruning capacity.

Graph Classification Node Classification +1

Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining

1 code implementation ICLR 2022 Lu Miao, Xiaolong Luo, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang

Conventional methods often require (iterative) pruning followed by re-training, which not only incurs large overhead beyond the original DNN training but also can be sensitive to retraining hyperparameters.

Universality of Deep Neural Network Lottery Tickets: A Renormalization Group Perspective

no code implementations29 Sep 2021 William T Redman, Tianlong Chen, Akshunna S. Dogra, Zhangyang Wang

Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.

Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable

no code implementations ICLR 2022 Shaojin Ding, Tianlong Chen, Zhangyang Wang

In this paper, we investigate the tantalizing possibility of using the lottery ticket hypothesis to discover lightweight speech recognition models that are (1) robust to various noise existing in speech; (2) transferable to fit open-world personalization; and (3) compatible with structured sparsity.

speech-recognition Speech Recognition

AutoCoG: A Unified Data-Modal Co-Search Framework for Graph Neural Networks

no code implementations29 Sep 2021 Duc N.M Hoang, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang

Despite the preliminary success, we argue that for GNNs, NAS has to be customized further, due to the topological complexity of GNN input data (graphs) as well as the notorious training instability.

Data Augmentation Language Modeling +2

Lottery Tickets can have Structural Sparsity

no code implementations29 Sep 2021 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

Scaling the Depth of Vision Transformers via the Fourier Domain Analysis

no code implementations ICLR 2022 Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang

The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.
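
A small sketch of the rescaling step described above: split the (row-stochastic) attention matrix into its uniform low-pass component and the high-pass residual, then amplify the residual with a learnable gain before recombining. The module and parameter names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttnScale(nn.Module):
    """Rescale the high-frequency part of a self-attention matrix: split A into a
    low-pass (uniform / DC) component and the high-pass residual, then re-weight
    the residual with a learnable scale before recombining. Names are illustrative."""
    def __init__(self, init_scale: float = 0.0):
        super().__init__()
        self.omega = nn.Parameter(torch.tensor(init_scale))   # learnable high-pass gain

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (..., N, N) softmax attention weights, each row sums to 1.
        n = attn.size(-1)
        low_pass = torch.full_like(attn, 1.0 / n)              # DC component of each row
        high_pass = attn - low_pass
        return low_pass + (1.0 + self.omega) * high_pass
```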

Generalizable Learning to Optimize into Wide Valleys

no code implementations29 Sep 2021 Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.

Stingy Teacher: Sparse Logits Suffice to Fail Knowledge Distillation

no code implementations29 Sep 2021 Haoyu Ma, Yifan Huang, Tianlong Chen, Hao Tang, Chenyu You, Zhangyang Wang, Xiaohui Xie

However, it is unclear why the distorted distribution of the logits is catastrophic to the student model.

Knowledge Distillation
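
As a rough illustration of the "sparse logits" idea in the title, the helper below keeps only each sample's top-k teacher logits and suppresses the rest before a student distills from them; the value of k and the large negative fill are assumptions made for illustration, not the paper's exact recipe.

```python
import torch

def sparsify_teacher_logits(logits: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Keep each sample's top-k logits and push all other classes toward zero
    probability after softmax. k and the fill value are illustrative choices."""
    topk = torch.topk(logits, k, dim=-1)
    sparse = torch.full_like(logits, -1e9)                 # ~zero probability after softmax
    return sparse.scatter(-1, topk.indices, topk.values)
```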

Sparse Unbalanced GAN Training with In-Time Over-Parameterization

no code implementations29 Sep 2021 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Perhaps most importantly, we find that, instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.

Model Compression

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

1 code implementation24 Aug 2021 Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

In view of those, we present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.

DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference

no code implementations16 Jul 2021 Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Celine Lin

Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for the algorithm efficiency, especially its applications on resource-limited platforms.

Scene Understanding Segmentation +1

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

2 code implementations NeurIPS 2021 Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang

Based on our analysis, we summarize a guideline for parameter settings in regards of specific architecture characteristics, which we hope to catalyze the research progress on the topic of lottery ticket hypothesis.

Scalable Perception-Action-Communication Loops with Convolutional and Graph Neural Networks

1 code implementation24 Jun 2021 Ting-Kuei Hu, Fernando Gama, Tianlong Chen, Wenqing Zheng, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler

Our framework is implemented by a cascade of a convolutional and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation and agent action inference, respectively.

Graph Neural Network Imitation Learning

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

2 code implementations NeurIPS 2021 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

Works on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have recently drawn much attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).

Network Pruning Sparse Learning

Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm

1 code implementation10 Jun 2021 Mingkang Zhu, Tianlong Chen, Zhangyang Wang

Compared to state-of-the-art methods, our homotopy attack leads to significantly fewer perturbations, e.g., reducing 42.91% on CIFAR-10 and 75.03% on ImageNet (average case, targeted attack), at similar maximal perturbation magnitudes, while still achieving 100% attack success rates.

Adversarial Attack

Graph Contrastive Learning Automated

2 code implementations10 Jun 2021 Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Unfortunately, unlike its counterpart on image data, the effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset, by either rules of thumb or trial-and-error, owing to the diverse nature of graph data.

Contrastive Learning Representation Learning +1

Chasing Sparsity in Vision Transformers: An End-to-End Exploration

1 code implementation NeurIPS 2021 Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

For example, our sparsified DeiT-Small at (5%, 50%) sparsity for (data, architecture) improves top-1 accuracy by 0.28%, while enjoying 49.32% FLOPs and 4.40% running-time savings.

Efficient ViTs

Efficient Lottery Ticket Finding: Less Data is More

1 code implementation6 Jun 2021 Zhenyu Zhang, Xuxi Chen, Tianlong Chen, Zhangyang Wang

We observe that a high-quality winning ticket can be found with training and pruning the dense network on the very compact PrAC set, which can substantially save training iterations for the ticket finding process.

Self-Damaging Contrastive Learning

1 code implementation6 Jun 2021 Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Hence, the key innovation in SDCLR is to create a dynamic self-competitor model to contrast with the target model, which is a pruned version of the latter.

Contrastive Learning Linear evaluation +2
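
The sketch below builds such a self-competitor by copying the target encoder and zeroing out its smallest-magnitude weights; the per-tensor magnitude criterion and the pruning ratio are illustrative assumptions, and in SDCLR the pruned copy is refreshed as the target keeps training.

```python
import copy
import torch
import torch.nn as nn

def build_self_competitor(target: nn.Module, prune_ratio: float = 0.9) -> nn.Module:
    """Copy the target encoder and magnitude-prune each weight tensor, producing a
    weaker 'self-competitor' to contrast against. Ratio and per-tensor criterion
    are illustrative assumptions."""
    competitor = copy.deepcopy(target)
    with torch.no_grad():
        for p in competitor.parameters():
            if p.dim() > 1:
                k = int(prune_ratio * p.numel())
                if k > 0:
                    threshold = p.abs().flatten().kthvalue(k).values
                    p.mul_((p.abs() > threshold).float())
    return competitor
```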

GANs Can Play Lottery Tickets Too

1 code implementation ICLR 2021 Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen

In this work, we for the first time study the existence of such trainable matching subnetworks in deep GANs.

Image-to-Image Translation

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling

1 code implementation NeurIPS 2021 Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Contrastive learning approaches have achieved great success in learning visual representations with few labels of the target classes.

Attribute Contrastive Learning +1

Undistillable: Making A Nasty Teacher That CANNOT teach students

1 code implementation ICLR 2021 Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

Knowledge Distillation (KD) is a widely used technique to transfer knowledge from pre-trained teacher models to (usually more lightweight) student models.

Knowledge Distillation

Troubleshooting Blind Image Quality Models in the Wild

no code implementations CVPR 2021 Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics.

Blind Image Quality Assessment Network Pruning

Learning Transferable 3D Adversarial Cloaks for Deep Trained Detectors

1 code implementation22 Apr 2021 Arman Maesumi, Mingkang Zhu, Yi Wang, Tianlong Chen, Zhangyang Wang, Chandrajit Bajaj

This paper presents a novel patch-based adversarial attack pipeline that trains adversarial patches on 3D human meshes.

Adversarial Attack Object

"BNN - BN = ?": Training Binary Neural Networks without Batch Normalization

1 code implementation16 Apr 2021 Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

However, the BN layer is costly to calculate and is typically implemented with non-binary parameters, leaving a hurdle for the efficient implementation of BNN training.

Image Classification

Learning to Optimize: A Primer and A Benchmark

1 code implementation23 Mar 2021 Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin

It automates the design of an optimization method based on its performance on a set of training problems.

Benchmarking
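
For readers new to L2O, the sketch below shows the core mechanism in its simplest form: a small coordinate-wise LSTM proposes parameter updates and is meta-trained by unrolling it on sampled training problems (here a random convex quadratic). The architecture, the update scaling, and the optimizee are simplifying assumptions, not any specific method covered by the benchmark.

```python
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    """Tiny learned optimizer: an LSTM maps each parameter's gradient to an update
    (coordinate-wise, shared weights). Sizes and scaling are simplified assumptions."""
    def __init__(self, hidden: int = 20):
        super().__init__()
        self.lstm = nn.LSTMCell(1, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, grad_flat, state):
        h, c = self.lstm(grad_flat.unsqueeze(-1), state)    # one LSTM step per coordinate
        return 0.1 * self.head(h).squeeze(-1), (h, c)       # scaled update for stability

def meta_loss(optimizer_net: LSTMOptimizer, dim: int = 10, unroll: int = 20):
    """Unroll the learned optimizer on a random convex quadratic 'optimizee' and
    average the losses along the trajectory; this is the meta-objective."""
    A = torch.randn(dim, dim)
    A = A @ A.t() / dim + torch.eye(dim)                     # positive-definite problem
    theta = torch.randn(dim, requires_grad=True)
    hidden = optimizer_net.lstm.hidden_size
    state = (torch.zeros(dim, hidden), torch.zeros(dim, hidden))
    total = 0.0
    for _ in range(unroll):
        loss = 0.5 * theta @ A @ theta
        grad, = torch.autograd.grad(loss, theta, create_graph=True)
        update, state = optimizer_net(grad, state)
        theta = theta + update                               # keep graph so meta-gradients flow
        total = total + loss
    return total / unroll
```

Backpropagating `meta_loss(optimizer_net)` into the LSTM's parameters with an outer optimizer is what "learns to optimize".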

Adversarial Feature Augmentation and Normalization for Visual Recognition

1 code implementation22 Mar 2021 Tianlong Chen, Yu Cheng, Zhe Gan, JianFeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu

Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.

Classification Data Augmentation +2

Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective

1 code implementation NeurIPS 2021 Tianlong Chen, Yu Cheng, Zhe Gan, Jingjing Liu, Zhangyang Wang

Training generative adversarial networks (GANs) with limited real image data generally results in deteriorated performance and collapsed models.

Data Augmentation

A Unified Lottery Ticket Hypothesis for Graph Neural Networks

2 code implementations12 Feb 2021 Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang

With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.

Link Prediction Node Classification

Spending Your Winning Lottery Better After Drawing It

1 code implementation8 Jan 2021 Ajay Kumar Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang

In this paper, we demonstrate that it is unnecessary for sparse retraining to strictly inherit those properties from the dense network.

Knowledge Distillation

Bayesian Learning to Optimize: Quantifying the Optimizer Uncertainty

no code implementations1 Jan 2021 Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen

Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.

image-classification Image Classification +2

Learning A Minimax Optimizer: A Pilot Study

no code implementations ICLR 2021 Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang

We first present Twin L2O, the first dedicated minimax L2O framework consisting of two LSTMs for updating min and max variables, respectively.

ALFA: Adversarial Feature Augmentation for Enhanced Image Recognition

no code implementations1 Jan 2021 Tianlong Chen, Yu Cheng, Zhe Gan, Yu Hu, Zhangyang Wang, Jingjing Liu

Adversarial training is an effective method to combat adversarial attacks in order to create robust neural networks.

Efficiently Troubleshooting Image Segmentation Models with Human-In-The-Loop

no code implementations1 Jan 2021 Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Image segmentation lays the foundation for many high-stakes vision applications such as autonomous driving and medical image analysis.

Autonomous Driving Image Segmentation +3

Robust Overfitting may be mitigated by properly learned smoothening

no code implementations ICLR 2021 Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang

A recent study (Rice et al., 2020) revealed overfitting to be a dominant phenomenon in adversarially robust training of deep networks, and that appropriate early-stopping of adversarial training (AT) could match the performance gains of most recent algorithmic improvements.

Knowledge Distillation

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

1 code implementation CVPR 2021 Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

We extend the scope of LTH and question whether matching subnetworks still exist in pre-trained computer vision models, that enjoy the same downstream transfer performance.

Robust Pre-Training by Adversarial Contrastive Learning

1 code implementation NeurIPS 2020 Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. In this work, we improve robustness-aware self-supervised pre-training by learning representations that are consistent under both data augmentations and adversarial perturbations.

Adversarial Robustness Contrastive Learning
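
One way to make that concrete: craft an adversarial "view" whose embedding disagrees with the clean view's embedding, then treat it as an extra positive in the contrastive loss. The PGD-style procedure below is a sketch under that reading; the step sizes, the cosine objective, and the single clean anchor are assumptions, since the paper itself explores several pairings of standard and adversarial branches.

```python
import torch
import torch.nn.functional as F

def adversarial_view(encoder, x, eps: float = 8 / 255, alpha: float = 2 / 255, steps: int = 5):
    """Craft an adversarial view of inputs in [0, 1] whose embedding disagrees most
    with the clean embedding (PGD-style sketch; hyperparameters are assumptions)."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    z_clean = F.normalize(encoder(x).detach(), dim=1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        z_adv = F.normalize(encoder(x_adv), dim=1)
        loss = -(z_adv * z_clean).sum(dim=1).mean()          # push embeddings apart
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```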

Graph Contrastive Learning with Augmentations

4 code implementations NeurIPS 2020 Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen

In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.

Contrastive Learning Representation Learning +2
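
The contrastive objective such a framework optimizes is typically the NT-Xent loss between embeddings of two augmented views of the same graphs; a minimal sketch follows, with the graph augmentations themselves (node dropping, edge perturbation, subgraph sampling, attribute masking) assumed to happen upstream in the data pipeline.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """Standard NT-Xent contrastive loss: z1[i] and z2[i] are embeddings of two
    augmented views of the same graph and form the positive pair for each row."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                           # (2n, d)
    sim = z @ z.t() / temperature                            # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                        # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])  # index of each positive
    return F.cross_entropy(sim, targets)
```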

Training Stronger Baselines for Learning to Optimize

1 code implementation NeurIPS 2020 Tianlong Chen, Weiyi Zhang, Jingyang Zhou, Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Learning to optimize (L2O) has gained increasing attention since classical optimizers require laborious problem-specific design and hyperparameter tuning.

Imitation Learning Rolling Shutter Correction

PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework Based on Adversarial Learning

no code implementations6 Oct 2020 Yuli Zheng, Zhenyu Wu, Ye Yuan, Tianlong Chen, Zhangyang Wang

While machine learning is increasingly used in this field, the resulting large-scale collection of user private information has reinvigorated the privacy debate, considering dozens of data breach incidents every year caused by unauthorized hackers, and (potentially even more) information misuse/abuse by authorized parties.

BIG-bench Machine Learning Privacy Preserving

Can 3D Adversarial Logos Cloak Humans?

1 code implementation25 Jun 2020 Yi Wang, Jingyang Zhou, Tianlong Chen, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang

Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering.

Object

Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training

1 code implementation ICML 2020 Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang

Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples.
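
For background, PU methods in this line typically start from the non-negative PU risk estimator (Kiryo et al., 2017), sketched below with a logistic loss; the self-boosting and calibration components that "Self-" refers to are not shown, and the loss choice and clamping at zero follow the standard estimator rather than this paper specifically.

```python
import torch
import torch.nn.functional as F

def nn_pu_risk(scores_p: torch.Tensor, scores_u: torch.Tensor, prior: float):
    """Non-negative PU risk with a logistic loss. `scores_p` / `scores_u` are
    classifier logits on labeled-positive and unlabeled examples; `prior` is the
    assumed class prior P(y = 1). Standard estimator, not this paper's full method."""
    loss_p_pos = F.softplus(-scores_p).mean()                # positives labeled +1
    loss_p_neg = F.softplus(scores_p).mean()                 # positives treated as negatives
    loss_u_neg = F.softplus(scores_u).mean()                 # unlabeled treated as negatives
    neg_risk = loss_u_neg - prior * loss_p_neg               # unbiased negative-risk estimate
    return prior * loss_p_pos + torch.clamp(neg_risk, min=0.0)   # clamp = "non-negative" fix
```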
