Search Results for author: Sheng Zha

Found 19 papers, 9 papers with code

Pre-training Differentially Private Models with Limited Public Data

no code implementations28 Feb 2024 Zhiqi Bu, Xinwei Zhang, Mingyi Hong, Sheng Zha, George Karypis

The superior performance of large foundation models relies on the use of massive amounts of high-quality data, which often contain sensitive, private and copyrighted material that requires formal protection.

Extreme Miscalibration and the Illusion of Adversarial Robustness

no code implementations27 Feb 2024 Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis

Deep learning-based Natural Language Processing (NLP) models are vulnerable to adversarial attacks, where small perturbations can cause a model to misclassify.

Adversarial Attack · Adversarial Robustness

Zero redundancy distributed learning with differential privacy

no code implementations20 Nov 2023 Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis

Deep learning using large models has achieved great success in a wide range of domains.

Privacy Preserving

On the accuracy and efficiency of group-wise clipping in differentially private optimization

no code implementations30 Oct 2023 Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis

Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters.
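The group-wise idea itself is easy to illustrate: instead of clipping each per-sample gradient to one global norm, every parameter group (e.g., each layer) is clipped to its own budget. A minimal sketch, assuming per-sample gradients are already materialized; the paper's contribution concerns doing this accurately and efficiently, and all names below are illustrative:

```python
import torch

def groupwise_clip(per_sample_grads, clip_norms):
    """Clip each parameter group's per-sample gradients to its own threshold.

    per_sample_grads: list of tensors, each shaped (batch, *param_shape)
    clip_norms: list of per-group clipping thresholds R_g
    """
    clipped = []
    for g, R in zip(per_sample_grads, clip_norms):
        norms = g.flatten(start_dim=1).norm(dim=1)      # per-sample L2 norm, (batch,)
        scale = (R / (norms + 1e-6)).clamp(max=1.0)     # shrink only samples above R_g
        clipped.append(g * scale.view(-1, *([1] * (g.dim() - 1))))
    return clipped
```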

Coupling public and private gradient provably helps optimization

no code implementations2 Oct 2023 Ruixuan Liu, Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

The success of large neural networks is crucially determined by the availability of data.

Large Language Models of Code Fail at Completing Code with Potential Bugs

1 code implementation NeurIPS 2023 Tuan Dinh, Jinman Zhao, Samson Tan, Renato Negrinho, Leonard Lausen, Sheng Zha, George Karypis

We find that the presence of potential bugs significantly degrades the generation performance of high-performing Code-LLMs.

Code Completion

Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

no code implementations1 Jun 2023 Hengzhi Pei, Jinman Zhao, Leonard Lausen, Sheng Zha, George Karypis

To better solve this task, we query a program analyzer for information relevant to a given function call, and consider ways to provide the analyzer results to different code completion models during inference and training.

Code Completion · Program Synthesis
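The recipe in the abstract can be pictured as follows: serialize whatever the analyzer knows about the callee and prepend it to the completion prompt. A hypothetical sketch that uses Python's `inspect` module as a stand-in for the program analyzer; the paper's actual analyzer and prompt format are not reproduced here:

```python
import inspect
import json

def build_completion_prompt(callee, code_prefix):
    """Prepend analyzer-style context about the callee to a completion prompt."""
    signature = str(inspect.signature(callee))
    first_doc_line = (inspect.getdoc(callee) or "").split("\n")[0]
    context = f"# callee: {callee.__name__}{signature}\n# doc: {first_doc_line}\n"
    return context + code_prefix

# e.g., ask a code LLM to complete the arguments of json.dumps
prompt = build_completion_prompt(json.dumps, "result = json.dumps(")
print(prompt)
```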

Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic

no code implementations8 Nov 2022 Soumajyoti Sarkar, Kaixiang Lin, Sailik Sengupta, Leonard Lausen, Sheng Zha, Saab Mansour

While prior studies have tried to adapt these multilingual models to dialectal variants of Arabic, this remains a challenging problem owing to the lack of sufficient monolingual dialectal data and parallel translation data for such variants.

Avg · Language Modelling · +1

Differentially Private Optimization on Large Model at Small Cost

2 code implementations30 Sep 2022 Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

Our implementation achieves state-of-the-art (SOTA) accuracy with very small extra cost: on GPT2, and at almost the same memory cost (<1% overhead), BK has 1.03X the time complexity of standard training (0.83X training speed in practice) and 0.61X the time complexity of the most efficient DP implementation (1.36X training speed in practice).

Privacy Preserving
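One reason DP training can approach standard-training cost is that per-sample gradient norms of a linear layer factor into activation and output-gradient norms, so the per-sample gradients never need to be materialized. A minimal sketch of that identity for 2-D inputs; the BK algorithm in the paper handles the general case and the subsequent clipped-gradient pass:

```python
import torch

B, d_in, d_out = 32, 128, 64
a = torch.randn(B, d_in)     # layer inputs saved in the forward pass
g = torch.randn(B, d_out)    # gradients w.r.t. the layer outputs

# Sample i's weight gradient is the outer product g_i a_i^T, so
# ||g_i a_i^T||_F = ||g_i|| * ||a_i||: norms in O(B (d_in + d_out)) memory.
norms_cheap = g.norm(dim=1) * a.norm(dim=1)

# Verify against the naive per-sample gradients, O(B d_in d_out) memory.
per_sample = torch.einsum("bo,bi->boi", g, a)
assert torch.allclose(norms_cheap, per_sample.flatten(1).norm(dim=1), atol=1e-4)
```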

Differentially Private Bias-Term only Fine-tuning of Foundation Models

1 code implementation30 Sep 2022 Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

We study the problem of differentially private (DP) fine-tuning of large pre-trained models -- a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data.

Privacy Preserving
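The bias-term-only recipe is straightforward to set up: freeze everything except the bias parameters, then run a DP optimizer over the tiny remaining parameter set. A minimal sketch of the freezing step for a generic PyTorch model; the DP-SGD machinery itself (e.g., from a library such as Opacus) is omitted:

```python
import torch.nn as nn

def freeze_all_but_biases(model: nn.Module):
    """Leave only bias terms trainable; all other parameters are frozen."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable.append(param)
    return trainable  # hand these to the (DP) optimizer

model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))
params = freeze_all_but_biases(model)
print(sum(p.numel() for p in params), "trainable parameters")  # 770 of ~592k
```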

Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

no code implementations NAACL 2022 Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks.

Multi-Task Learning · Representation Learning

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

1 code implementation24 Jun 2020 Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

Using the proposed LANS method and the learning rate scheme, we scaled up the mini-batch sizes to 96K and 33K in phases 1 and 2 of BERT pretraining, respectively.

Natural Language Understanding
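The core ingredient in LAMB-style large-batch optimizers, of which LANS is a Nesterov-style variant, is a layer-wise "trust ratio" that rescales each layer's update by ||w|| / ||update||. A simplified, illustrative sketch of that rescaling only; LANS's Nesterov momentum and per-block normalization details are omitted:

```python
import torch

def trust_ratio_step(params, updates, lr):
    """Apply a LAMB-style layer-wise trust ratio to precomputed Adam-like updates."""
    for w, u in zip(params, updates):
        w_norm, u_norm = w.norm(), u.norm()
        # Scale the step so its length is proportional to the layer's weight norm.
        ratio = (w_norm / u_norm).item() if w_norm > 0 and u_norm > 0 else 1.0
        w.data.add_(u, alpha=-lr * ratio)
```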

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
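As a quick illustration of the toolkits in use, here is the kind of snippet the GluonCV model zoo supports (model name taken from the GluonCV detection tutorial; exact availability may vary across versions):

```python
from gluoncv import data, model_zoo, utils

# Load a pretrained detector from the GluonCV model zoo.
net = model_zoo.get_model("yolo3_darknet53_voc", pretrained=True)

# Download a demo image, preprocess it, and run detection.
url = "https://raw.githubusercontent.com/zhreshold/mxnet-ssd/master/data/demo/dog.jpg"
im_fname = utils.download(url, path="dog.jpg")
x, img = data.transforms.presets.yolo.load_test(im_fname, short=512)
class_ids, scores, bboxes = net(x)
```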

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

2 code implementations26 Apr 2019 Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.

Image Classification · object-detection · +3
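A baseline ingredient for elastic training of this kind is the linear learning-rate scaling rule applied when the effective batch size changes; the paper's dynamic SGD additionally smooths the transition to counter the accumulated momentum noise. A toy sketch of the linear-scaling part only (the smoothing schedule is not reproduced here):

```python
def scaled_lr(base_lr, base_batch_size, current_batch_size):
    """Linear scaling rule: keep the per-example step size roughly constant
    when elastic training changes the number of workers / global batch size."""
    return base_lr * current_batch_size / base_batch_size

# e.g., workers leave and the global batch shrinks from 2048 to 1536
print(scaled_lr(0.8, 2048, 1536))  # 0.6
```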

Question Type Guided Attention in Visual Question Answering

no code implementations ECCV 2018 Yang Shi, Tommaso Furlanello, Sheng Zha, Animashree Anandkumar

Visual Question Answering (VQA) requires integration of feature maps with drastically different structures and focus on the correct regions.

Activity Recognition · Question Answering · +2
