Search Results for author: Tianhao Wang

Found 57 papers, 25 papers with code

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers

no code implementations · 9 Sep 2024 · Siyu Chen, Heejune Sheen, Tianhao Wang, Zhuoran Yang

In the limiting model, the first attention layer acts as a $\mathit{copier}$, copying past tokens within a given window to each position, and the feed-forward network with normalization acts as a $\mathit{selector}$ that generates a feature vector by only looking at informationally relevant parents from the window.

In-Context Learning, Large Language Model, +1

Towards Certified Unlearning for Deep Neural Networks

1 code implementation · 1 Aug 2024 · Binchi Zhang, Yushun Dong, Tianhao Wang, Jundong Li

In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees.

Machine Unlearning

Towards Understanding Unsafe Video Generation

1 code implementation · 17 Jul 2024 · Yan Pang, Aiping Xiong, Yang Zhang, Tianhao Wang

With the labeled information and the corresponding prompts, we created the first dataset of unsafe videos generated by VGMs.

Image Generation, Video Generation

Learning Interpretable Fair Representations

no code implementations · 24 Jun 2024 · Tianhao Wang, Zana Buçinca, Zilin Ma

Numerous approaches have been recently proposed for learning fair representations that mitigate unfair outcomes in prediction tasks.

Representation Learning

TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents

1 code implementation · 18 Apr 2024 · Chen Gong, Kecen Li, Jin Yao, Tianhao Wang

While this new paradigm presents remarkable effectiveness across various real-world domains, like healthcare and energy management, there is a growing demand to enable agents to rapidly and completely eliminate the influence of specific trajectories from both the training dataset and the trained agents.

energy management, Offline RL, +3

Implicit Regularization of Gradient Flow on One-Layer Softmax Attention

no code implementations · 13 Mar 2024 · Heejune Sheen, Siyu Chen, Tianhao Wang, Harrison H. Zhou

Under a separability assumption on the data, we show that when gradient flow achieves the minimal loss value, it further implicitly minimizes the nuclear norm of the product of the key and query weight matrices.

How Well Can Transformers Emulate In-context Newton's Method?

no code implementations · 5 Mar 2024 · Angeliki Giannou, Liu Yang, Tianhao Wang, Dimitris Papailiopoulos, Jason D. Lee

Recent studies have suggested that Transformers can implement first-order optimization algorithms for in-context learning, and even second-order ones in the case of linear regression.

In-Context Learning, regression

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality

no code implementations · 29 Feb 2024 · Siyu Chen, Heejune Sheen, Tianhao Wang, Zhuoran Yang

In addition, we prove that an interesting "task allocation" phenomenon emerges during the gradient flow dynamics, where each attention head focuses on solving a single task of the multi-task model.

In-Context Learning

Machine Unlearning of Pre-trained Large Language Models

1 code implementation · 23 Feb 2024 · Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, Xiang Yue

This study investigates the concept of the "right to be forgotten" within the context of large language models (LLMs).

Machine Unlearning

VGMShield: Mitigating Misuse of Video Generative Models

1 code implementation · 20 Feb 2024 · Yan Pang, Yang Zhang, Tianhao Wang

Together with fake video detection and tracing, our multi-faceted set of solutions can effectively mitigate misuse of video generative models.

Video Generation

Revisiting Differentially Private Hyper-parameter Tuning

no code implementations · 20 Feb 2024 · Zihang Xiang, Tianhao Wang, Chenglong Wang, Di Wang

Recent works propose a generic private selection solution for the tuning process, yet a fundamental question persists: is this privacy bound tight?

A Somewhat Robust Image Watermark against Diffusion-based Editing Models

1 code implementation · 22 Nov 2023 · Mingtian Tan, Tianhao Wang, Somesh Jha

In response, we develop a novel technique, RIW (Robust Invisible Watermarking), to embed invisible watermarks leveraging adversarial example techniques.

Image Generation

Preserving Node-level Privacy in Graph Neural Networks

no code implementations · 12 Nov 2023 · Zihang Xiang, Tianhao Wang, Di Wang

In this study, we propose a solution that specifically addresses the issue of node-level privacy.

PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining

1 code implementation · 19 Oct 2023 · Kecen Li, Chen Gong, Zhixiang Li, Yuzhong Zhao, Xinwen Hou, Tianhao Wang

Then, this function assists in querying the semantic distribution of the sensitive dataset, facilitating the selection of data from the public dataset with analogous semantics for pre-training.

Image Generation

Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning

no code implementations · 17 Oct 2023 · Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem

Large Language Models (LLMs) are powerful tools for natural language processing, enabling novel applications and user experiences.

In-Context Learning

The Marginal Value of Momentum for Small Learning Rate SGD

no code implementations · 27 Jul 2023 · Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li

Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise.

Stochastic Optimization

Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems

no code implementations · 3 Jul 2023 · Debopam Sanyal, Jui-Tse Hung, Manav Agrawal, Prahlad Jasti, Shahab Nikkhoo, Somesh Jha, Tianhao Wang, Sibin Mohan, Alexey Tumanov

Second, we counter the proposed attack with a noise-based defense mechanism that thwarts fingerprinting by adding noise to the specified performance metrics.

Model extraction

Differentially Private Wireless Federated Learning Using Orthogonal Sequences

no code implementations · 14 Jun 2023 · Xizixiang Wei, Tianhao Wang, Ruiquan Huang, Cong Shen, Jing Yang, H. Vincent Poor

A new FL convergence bound is derived which, combined with the privacy guarantees, allows for a smooth tradeoff between the achieved convergence rate and differential privacy levels.

Federated Learning, Privacy Preserving

Interpreting GNN-based IDS Detections Using Provenance Graph Structural Features

no code implementations · 1 Jun 2023 · Kunal Mukherjee, Joshua Wiedemeier, Tianhao Wang, Muhyun Kim, Feng Chen, Murat Kantarcioglu, Kangkook Jee

PROVEXPLAINER allowed simple DT models to achieve 95% fidelity to the GNN on program classification tasks with general graph structural features, and 99% fidelity on malware detection tasks with a task-specific feature package tailored for direct interpretation.

Decision Making, Descriptive, +3

Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation

no code implementations · 10 May 2023 · Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

We study multi-agent reinforcement learning in the setting of episodic Markov decision processes, where multiple agents cooperate via communication through a central server.

Multi-agent Reinforcement Learning, reinforcement-learning

Neural Lumped Parameter Differential Equations with Application in Friction-Stir Processing

no code implementations · 18 Apr 2023 · James Koch, Woongjo Choi, Ethan King, David Garcia, Hrishikesh Das, Tianhao Wang, Ken Ross, Keerti Kappagantula

Lumped parameter methods aim to simplify the evolution of spatially-extended or continuous physical systems to that of a "lumped" element representative of the physical scales of the modeled system.

Friction

Practical Differentially Private and Byzantine-resilient Federated Learning

1 code implementation · 15 Apr 2023 · Zihang Xiang, Tianhao Wang, WanYu Lin, Di Wang

In contrast, we leverage the random noise to construct an aggregation that effectively rejects many existing Byzantine attacks.

Federated Learning, Privacy Preserving

FACE-AUDITOR: Data Auditing in Facial Recognition Systems

2 code implementations · 5 Apr 2023 · Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Yang Zhang

Few-shot-based facial recognition systems have gained increasing attention due to their scalability and ability to work with a few face images during the model deployment phase.

A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots

1 code implementation · 23 Feb 2023 · Boyang Zhang, Xinlei He, Yun Shen, Tianhao Wang, Yang Zhang

Given the simplicity and effectiveness of the attack method, our study indicates scientific plots indeed constitute a valid side channel for model information stealing attacks.

valid

BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets

1 code implementation · 7 Oct 2022 · Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang

Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack.

Autonomous Driving, Backdoor Attack, +4

Federated Boosted Decision Trees with Differential Privacy

2 code implementations · 6 Oct 2022 · Samuel Maddock, Graham Cormode, Tianhao Wang, Carsten Maple, Somesh Jha

There is great demand for scalable, secure, and efficient privacy-preserving machine learning models that can be trained over distributed data.

Privacy Preserving

Differentially Private Vertical Federated Clustering

2 code implementations · 2 Aug 2022 · Zitao Li, Tianhao Wang, Ninghui Li

To enable model learning while protecting the privacy of the data subjects, we need vertical federated learning (VFL) techniques, where the data parties share only information for training the model, instead of the private data.

Clustering, Vertical Federated Learning

Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation

no code implementations · 22 Jul 2022 · Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Our attack can be easily deployed in the real world since it only requires rotating the object, as we show in both image classification and object detection applications.

Data Augmentation, Image Classification, +3

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent

no code implementations · 8 Jul 2022 · Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora

Conversely, continuous mirror descent with any Legendre function can be viewed as gradient flow with a related commuting parametrization.

A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits

no code implementations · 7 Jul 2022 · Jiafan He, Tianhao Wang, Yifei Min, Quanquan Gu

To the best of our knowledge, this is the first provably efficient algorithm that allows fully asynchronous communication for federated contextual linear bandits, while achieving the same regret guarantee as in the single-agent setting.

Memorization in NLP Fine-tuning Methods

1 code implementation · 25 May 2022 · FatemehSadat Mireshghallah, Archit Uniyal, Tianhao Wang, David Evans, Taylor Berg-Kirkpatrick

Large language models are shown to present privacy risks through memorization of training data, and several recent works have studied such risks for the pre-training phase.

Memorization

Learning Stochastic Shortest Path with Linear Function Approximation

no code implementations · 25 Oct 2021 · Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

To the best of our knowledge, this is the first algorithm with a sublinear regret guarantee for learning linear mixture SSP.

What Happens after SGD Reaches Zero Loss? --A Mathematical Framework

no code implementations · ICLR 2022 · Zhiyuan Li, Tianhao Wang, Sanjeev Arora

Understanding the implicit bias of Stochastic Gradient Descent (SGD) is one of the key challenges in deep learning, especially for overparametrized models, where the local minimizers of the loss function $L$ can form a manifold.

valid

Towards General Robustness to Bad Training Data

no code implementations · 29 Sep 2021 · Tianhao Wang, Yi Zeng, Ming Jin, Ruoxi Jia

In this paper, we focus on the problem of identifying bad training data when the underlying cause is unknown in advance.

Data Summarization

Zero-Round Active Learning

no code implementations · 14 Jul 2021 · Si Chen, Tianhao Wang, Ruoxi Jia

Our algorithm does not rely on any feedback from annotators in the target domain and hence, can be used to perform zero-round active learning or warm-start existing multi-round active learning strategies.

Active Learning, Domain Adaptation

Improving Cooperative Game Theory-based Data Valuation via Data Utility Learning

1 code implementation · 13 Jul 2021 · Tianhao Wang, Yu Yang, Ruoxi Jia

The Shapley value (SV) and Least core (LC) are classic methods in cooperative game theory for cost/profit sharing problems.

Active Learning, Data Valuation

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

no code implementations · NeurIPS 2021 · Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu

We study the off-policy evaluation (OPE) problem in reinforcement learning with linear function approximation, which aims to estimate the value function of a target policy based on the offline data collected by a behavior policy.

Off-policy evaluation

A Unified Framework for Task-Driven Data Quality Management

no code implementations · 10 Jun 2021 · Tianhao Wang, Yi Zeng, Ming Jin, Ruoxi Jia

High-quality data is critical to train performant Machine Learning (ML) models, highlighting the importance of Data Quality Management (DQM).

Data Summarization, Data Valuation, +1

Differential Privacy for Text Analytics via Natural Text Sanitization

1 code implementation · Findings (ACL) 2021 · Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

The sanitized texts also contribute to our sanitization-aware pretraining and fine-tuning, enabling privacy-preserving natural language processing over the BERT language model with promising utility.

Language Modelling, Privacy Preserving

One-Round Active Learning

no code implementations · 23 Apr 2021 · Tianhao Wang, Si Chen, Ruoxi Jia

In this work, we initiate the study of one-round active learning, which aims to select a subset of unlabeled data points that achieve the highest model performance after being labeled with only the information from initially labeled data points.

Active Learning

Graph Unlearning

1 code implementation · 27 Mar 2021 · Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

In this paper, we propose GraphEraser, a novel machine unlearning framework tailored to graph data.

Machine Unlearning

Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints

no code implementations · NeurIPS 2021 · Tianhao Wang, Dongruo Zhou, Quanquan Gu

Specifically, for the batch learning model, our proposed LSVI-UCB-Batch algorithm achieves an $\tilde O(\sqrt{d^3H^3T} + dHT/B)$ regret, where $d$ is the dimension of the feature mapping, $H$ is the episode length, $T$ is the number of interactions, and $B$ is the number of batches.

reinforcement-learning, Reinforcement Learning (RL)

PURE: A Framework for Analyzing Proximity-based Contact Tracing Protocols

no code implementations · 17 Dec 2020 · Fabrizio Cicala, Weicheng Wang, Tianhao Wang, Ninghui Li, Elisa Bertino, Faming Liang, Yang Yang

Many proximity-based contact tracing (PCT) protocols have been proposed and deployed to combat the spread of COVID-19.

Computers and Society C.3; H.4; J.3; J.7; K.4; K.6.5

A Principled Approach to Data Valuation for Federated Learning

no code implementations · 14 Sep 2020 · Tianhao Wang, Johannes Rausch, Ce Zhang, Ruoxi Jia, Dawn Song

The federated SV preserves the desirable properties of the canonical SV while it can be calculated without incurring extra communication cost and is also able to capture the effect of participation order on data value.

Data Summarization, Data Valuation, +1

Improving Robustness to Model Inversion Attacks via Mutual Information Regularization

2 code implementations · 11 Sep 2020 · Tianhao Wang, Yuheng Zhang, Ruoxi Jia

This paper studies defense mechanisms against model inversion (MI) attacks, a type of privacy attack aimed at inferring information about the training data distribution given access to a target machine learning model.

When Machine Unlearning Jeopardizes Privacy

1 code implementation · 5 May 2020 · Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

More importantly, we show that our attack in multiple cases outperforms the classical membership inference attack on the original ML model, which indicates that machine unlearning can have counterproductive effects on privacy.

Inference Attack, Machine Unlearning, +1

Estimating Numerical Distributions under Local Differential Privacy

2 code implementations · 2 Dec 2019 · Zitao Li, Tianhao Wang, Milan Lopuhaä-Zwakenberg, Boris Skoric, Ninghui Li

When collecting information, local differential privacy (LDP) relieves the concern of privacy leakage from the users' perspective, as each user's private information is randomized before being sent to the aggregator.

RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

1 code implementation · 31 Oct 2019 · Tianhao Wang, Florian Kerschbaum

White-box watermarking algorithms have the advantage that they do not impact the accuracy of the watermarked model.

Inference Attack

Improving Utility and Security of the Shuffler-based Differential Privacy

1 code implementation · 30 Aug 2019 · Tianhao Wang, Bolin Ding, Min Xu, Zhicong Huang, Cheng Hong, Jingren Zhou, Ninghui Li, Somesh Jha

When collecting information, local differential privacy (LDP) alleviates privacy concerns of users because their private information is randomized before being sent to the central aggregator.

Locally Differentially Private Frequency Estimation with Consistency

1 code implementation · 20 May 2019 · Tianhao Wang, Milan Lopuhaä-Zwakenberg, Zitao Li, Boris Skoric, Ninghui Li

In this paper, we show that adding post-processing steps to FO protocols, exploiting the knowledge that all individual frequencies should be non-negative and sum to one, can lead to significantly better accuracy for a wide range of tasks, including frequencies of individual values, frequencies of the most frequent values, and frequencies of subsets of values.

Continuous and Discrete-time Accelerated Stochastic Mirror Descent for Strongly Convex Functions

no code implementations · ICML 2018 · Pan Xu, Tianhao Wang, Quanquan Gu

We provide a second-order stochastic differential equation (SDE), which characterizes the continuous-time dynamics of accelerated stochastic mirror descent (ASMD) for strongly convex functions.

Stochastic Optimization
