Search Results for author: Sheng Zha

Found 19 papers, 9 papers with code

Pre-training Differentially Private Models with Limited Public Data

no code implementations28 Feb 2024 Zhiqi Bu, Xinwei Zhang, Mingyi Hong, Sheng Zha, George Karypis

The superior performance of large foundation models relies on the use of massive amounts of high-quality data, which often contain sensitive, private and copyrighted material that requires formal protection.

Extreme Miscalibration and the Illusion of Adversarial Robustness

no code implementations27 Feb 2024 Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis

Deep learning-based Natural Language Processing (NLP) models are vulnerable to adversarial attacks, where small perturbations can cause a model to misclassify.

Adversarial Attack · Adversarial Robustness

Zero redundancy distributed learning with differential privacy

no code implementations20 Nov 2023 Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis

Deep learning using large models has achieved great success in a wide range of domains.

Privacy Preserving

On the accuracy and efficiency of group-wise clipping in differentially private optimization

no code implementations30 Oct 2023 Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis

Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters.
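The group-wise idea itself is easy to illustrate: instead of clipping each per-sample gradient to one global norm, every parameter group (e.g., each layer) is clipped to its own budget. A minimal sketch, assuming per-sample gradients are already materialized; the paper's contribution concerns doing this accurately and efficiently, and all names below are illustrative:

```python
import torch

def groupwise_clip(per_sample_grads, clip_norms):
    """Clip each parameter group's per-sample gradients to its own threshold.

    per_sample_grads: list of tensors, each shaped (batch, *param_shape)
    clip_norms: list of per-group clipping thresholds R_g
    """
    clipped = []
    for g, R in zip(per_sample_grads, clip_norms):
        norms = g.flatten(start_dim=1).norm(dim=1)      # per-sample L2 norm, (batch,)
        scale = (R / (norms + 1e-6)).clamp(max=1.0)     # shrink only samples above R_g
        clipped.append(g * scale.view(-1, *([1] * (g.dim() - 1))))
    return clipped
```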

Coupling public and private gradient provably helps optimization

no code implementations2 Oct 2023 Ruixuan Liu, Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

The success of large neural networks is crucially determined by the availability of data.

Large Language Models of Code Fail at Completing Code with Potential Bugs

1 code implementation NeurIPS 2023 Tuan Dinh, Jinman Zhao, Samson Tan, Renato Negrinho, Leonard Lausen, Sheng Zha, George Karypis

We find that the presence of potential bugs significantly degrades the generation performance of high-performing Code-LLMs.

Code Completion

Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

no code implementations1 Jun 2023 Hengzhi Pei, Jinman Zhao, Leonard Lausen, Sheng Zha, George Karypis

To better solve this task, we query a program analyzer for information relevant to a given function call, and consider ways to provide the analyzer results to different code completion models during inference and training.

Code Completion · Program Synthesis
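The recipe in the abstract can be pictured as follows: serialize whatever the analyzer knows about the callee and prepend it to the completion prompt. A hypothetical sketch that uses Python's `inspect` module as a stand-in for the program analyzer; the paper's actual analyzer and prompt format are not reproduced here:

```python
import inspect
import json

def build_completion_prompt(callee, code_prefix):
    """Prepend analyzer-style context about the callee to a completion prompt."""
    signature = str(inspect.signature(callee))
    first_doc_line = (inspect.getdoc(callee) or "").split("\n")[0]
    context = f"# callee: {callee.__name__}{signature}\n# doc: {first_doc_line}\n"
    return context + code_prefix

# e.g., ask a code LLM to complete the arguments of json.dumps
prompt = build_completion_prompt(json.dumps, "result = json.dumps(")
print(prompt)
```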

Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic

no code implementations8 Nov 2022 Soumajyoti Sarkar, Kaixiang Lin, Sailik Sengupta, Leonard Lausen, Sheng Zha, Saab Mansour

While prior studies have tried to adapt these multilingual models to dialectal variants of Arabic, this remains a challenging problem owing to the lack of sufficient monolingual dialectal data and parallel translation data for such variants.

Avg · Language Modelling · +1

Differentially Private Optimization on Large Model at Small Cost

2 code implementations30 Sep 2022 Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

Our implementation achieves state-of-the-art (SOTA) accuracy with very small extra cost: on GPT2, and at almost the same memory cost (<1% overhead), BK has 1.03X the time complexity of standard training (0.83X training speed in practice) and 0.61X the time complexity of the most efficient DP implementation (1.36X training speed in practice).

Privacy Preserving
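One reason DP training can approach standard-training cost is that per-sample gradient norms of a linear layer factor into activation and output-gradient norms, so the per-sample gradients never need to be materialized. A minimal sketch of that identity for 2-D inputs; the BK algorithm in the paper handles the general case and the subsequent clipped-gradient pass:

```python
import torch

B, d_in, d_out = 32, 128, 64
a = torch.randn(B, d_in)     # layer inputs saved in the forward pass
g = torch.randn(B, d_out)    # gradients w.r.t. the layer outputs

# Sample i's weight gradient is the outer product g_i a_i^T, so
# ||g_i a_i^T||_F = ||g_i|| * ||a_i||: norms in O(B (d_in + d_out)) memory.
norms_cheap = g.norm(dim=1) * a.norm(dim=1)

# Verify against the naive per-sample gradients, O(B d_in d_out) memory.
per_sample = torch.einsum("bo,bi->boi", g, a)
assert torch.allclose(norms_cheap, per_sample.flatten(1).norm(dim=1), atol=1e-4)
```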

Differentially Private Bias-Term only Fine-tuning of Foundation Models

1 code implementation30 Sep 2022 Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

We study the problem of differentially private (DP) fine-tuning of large pre-trained models -- a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data.

Privacy Preserving
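The bias-term-only recipe is straightforward to set up: freeze everything except the bias parameters, then run a DP optimizer over the tiny remaining parameter set. A minimal sketch of the freezing step for a generic PyTorch model; the DP-SGD machinery itself (e.g., from a library such as Opacus) is omitted:

```python
import torch.nn as nn

def freeze_all_but_biases(model: nn.Module):
    """Leave only bias terms trainable; all other parameters are frozen."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable.append(param)
    return trainable  # hand these to the (DP) optimizer

model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))
params = freeze_all_but_biases(model)
print(sum(p.numel() for p in params), "trainable parameters")  # 770 of ~592k
```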

Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

no code implementations NAACL 2022 Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks.

Multi-Task Learning · Representation Learning

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

1 code implementation24 Jun 2020 Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

Using the proposed LANS method and the learning rate scheme, we scaled up the mini-batch sizes to 96K and 33K in phases 1 and 2 of BERT pretraining, respectively.

Natural Language Understanding
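The core ingredient in LAMB-style large-batch optimizers, of which LANS is a Nesterov-style variant, is a layer-wise "trust ratio" that rescales each layer's update by ||w|| / ||update||. A simplified, illustrative sketch of that rescaling only; LANS's Nesterov momentum and per-block normalization details are omitted:

```python
import torch

def trust_ratio_step(params, updates, lr):
    """Apply a LAMB-style layer-wise trust ratio to precomputed Adam-like updates."""
    for w, u in zip(params, updates):
        w_norm, u_norm = w.norm(), u.norm()
        # Scale the step so its length is proportional to the layer's weight norm.
        ratio = (w_norm / u_norm).item() if w_norm > 0 and u_norm > 0 else 1.0
        w.data.add_(u, alpha=-lr * ratio)
```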

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
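As a quick illustration of the toolkits in use, here is the kind of snippet the GluonCV model zoo supports (model name taken from the GluonCV detection tutorial; exact availability may vary across versions):

```python
from gluoncv import data, model_zoo, utils

# Load a pretrained detector from the GluonCV model zoo.
net = model_zoo.get_model("yolo3_darknet53_voc", pretrained=True)

# Download a demo image, preprocess it, and run detection.
url = "https://raw.githubusercontent.com/zhreshold/mxnet-ssd/master/data/demo/dog.jpg"
im_fname = utils.download(url, path="dog.jpg")
x, img = data.transforms.presets.yolo.load_test(im_fname, short=512)
class_ids, scores, bboxes = net(x)
```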

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

2 code implementations26 Apr 2019 Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.

Image Classification · object-detection · +3
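A baseline ingredient for elastic training of this kind is the linear learning-rate scaling rule applied when the effective batch size changes; the paper's dynamic SGD additionally smooths the transition to counter the accumulated momentum noise. A toy sketch of the linear-scaling part only (the smoothing schedule is not reproduced here):

```python
def scaled_lr(base_lr, base_batch_size, current_batch_size):
    """Linear scaling rule: keep the per-example step size roughly constant
    when elastic training changes the number of workers / global batch size."""
    return base_lr * current_batch_size / base_batch_size

# e.g., workers leave and the global batch shrinks from 2048 to 1536
print(scaled_lr(0.8, 2048, 1536))  # 0.6
```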

Question Type Guided Attention in Visual Question Answering

no code implementations ECCV 2018 Yang Shi, Tommaso Furlanello, Sheng Zha, Animashree Anandkumar

Visual Question Answering (VQA) requires integration of feature maps with drastically different structures and focus on the correct regions.

Activity Recognition · Question Answering · +2
