no code implementations • 2 Jun 2024 • Yihan Wu, Ruibo Chen, Zhengmian Hu, Yanshuo Chen, Junfeng Guo, Hongyang Zhang, Heng Huang
Experimental results show that the beta-watermark effectively reduces distribution bias under key collisions.
no code implementations • 20 Nov 2023 • Zhengmian Hu, Gang Wu, Saayan Mitra, Ruiyi Zhang, Tong Sun, Heng Huang, Viswanathan Swaminathan
Our work aims to address this concern by introducing a novel approach to detecting adversarial prompts at the token level, leveraging the LLM's ability to predict next-token probabilities.
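A minimal sketch of what such token-level scoring could look like, assuming a Hugging Face causal LM; the model choice (`gpt2`) and the log-probability threshold are illustrative assumptions, not the paper's settings:

```python
# Sketch: flag tokens in a prompt whose probability under a causal LM is
# unusually low, as a crude proxy for token-level adversarial detection.
# Model name and threshold are illustrative assumptions, not the paper's.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def token_logprobs(prompt: str):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                  # (1, T, vocab)
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    # log-probability the model assigned to each observed next token
    picked = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return ids[0, 1:], picked[0]

def flag_suspicious(prompt: str, threshold: float = -10.0):
    ids, lps = token_logprobs(prompt)
    return [(tokenizer.decode([int(t)]), lp.item())
            for t, lp in zip(ids, lps) if lp < threshold]
```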
no code implementations • 27 Oct 2023 • Ruibo Chen, Tianyi Xiong, Yihan Wu, Guodong Liu, Zhengmian Hu, Lichang Chen, Yanshuo Chen, Chenxi Liu, Heng Huang
This technical report examines the application of GPT-4 Vision (GPT-4V) to COVID-19 image classification, leveraging in-context learning to enhance diagnostic processes.
1 code implementation • 11 Oct 2023 • Yihan Wu, Zhengmian Hu, Junfeng Guo, Hongyang Zhang, Heng Huang
Watermarking techniques offer a promising way to identify machine-generated content by embedding covert information into the text generated by language models.
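For background, a minimal sketch of a generic green-list style watermark detector (the common hash-and-z-test construction), not the authors' specific scheme; the hashing key, green fraction, and z-threshold are illustrative assumptions:

```python
# Sketch of a generic green-list watermark detector: count how many tokens
# fall in a pseudorandom "green" subset seeded by the previous token, then
# z-test against the unwatermarked expectation. Not the authors' scheme.
import hashlib
import math

GREEN_FRACTION = 0.5  # fraction of the vocabulary treated as "green"

def is_green(prev_token: int, token: int, key: str = "secret") -> bool:
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GREEN_FRACTION

def detect(tokens: list[int], z_threshold: float = 4.0) -> bool:
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    mean = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - mean) / std > z_threshold
```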
1 code implementation • NeurIPS 2023 • Xidong Wu, Jianhui Sun, Zhengmian Hu, Aidong Zhang, Heng Huang
We propose federated learning algorithms (FedSGDA+ and FedSGDA-M) and lower the existing complexity bounds for the most common minimax problems.
1 code implementation • 6 Aug 2023 • Xidong Wu, Zhengmian Hu, Jian Pei, Heng Huang
To address this challenge, we study serverless multi-party collaborative AUPRC maximization, since serverless multi-party collaborative training avoids the server-node bottleneck and thereby cuts communication costs. We reformulate the problem as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC.
no code implementations • 8 Feb 2023 • Xidong Wu, Zhengmian Hu, Heng Huang
Minimax optimization over Riemannian manifolds (possibly with nonconvex constraints) has been actively applied to many problems, such as robust dimensionality reduction and training deep neural networks with orthogonal weights (the Stiefel manifold).
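As a concrete illustration of the Stiefel-manifold constraint, here is a sketch of the standard QR-based retraction and tangent-space projection used in Riemannian optimization; this is generic machinery, not the paper's specific algorithm, and the step size is an assumption:

```python
# Sketch: one retracted descent step on the Stiefel manifold
# St(n, p) = {X : X^T X = I_p}, using the QR-decomposition retraction.
import numpy as np

def qr_retraction(X: np.ndarray) -> np.ndarray:
    # Map an arbitrary full-rank n x p matrix back onto the Stiefel manifold.
    Q, R = np.linalg.qr(X)
    # Fix column signs so the retraction is uniquely defined.
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def riemannian_grad(X: np.ndarray, euclid_grad: np.ndarray) -> np.ndarray:
    # Project the Euclidean gradient onto the tangent space at X.
    sym = (X.T @ euclid_grad + euclid_grad.T @ X) / 2
    return euclid_grad - X @ sym

def stiefel_step(X, euclid_grad, lr=0.1):
    return qr_retraction(X - lr * riemannian_grad(X, euclid_grad))
```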
no code implementations • 2 Dec 2022 • Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang
Federated learning has attracted increasing attention with the emergence of distributed data.
no code implementations • NeurIPS 2021 • Zhengmian Hu, Feihu Huang, Heng Huang
In the paper, we study the underdamped Langevin diffusion (ULD) with a strongly convex potential consisting of a finite sum of $N$ smooth components, and propose an efficient discretization method, which requires $O(N+d^\frac{1}{3}N^\frac{2}{3}/\varepsilon^\frac{2}{3})$ gradient evaluations to achieve $\varepsilon$-error (in $\sqrt{\mathbb{E}\lVert\cdot\rVert_2^2}$ distance) for approximating $d$-dimensional ULD.
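For reference, one standard parameterization of the underdamped Langevin diffusion being discretized (the paper's constants may differ): $\mathrm{d}v_t = -\gamma v_t\,\mathrm{d}t - u\,\nabla f(x_t)\,\mathrm{d}t + \sqrt{2\gamma u}\,\mathrm{d}B_t$ and $\mathrm{d}x_t = v_t\,\mathrm{d}t$, where $\gamma$ is the friction coefficient, $B_t$ is a standard $d$-dimensional Brownian motion, and the invariant distribution is proportional to $\exp\!\big(-f(x) - \lVert v\rVert_2^2/(2u)\big)$.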
1 code implementation • 21 Jul 2021 • Huimin Wu, Zhengmian Hu, Bin Gu
Although a wide range of research has been conducted in recent years to improve the adversarial robustness of learning models, most of it is limited to deep neural networks (DNNs), and work on kernel SVMs remains scarce.
no code implementations • 30 Jun 2021 • Feihu Huang, Xidong Wu, Zhengmian Hu
Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique, which achieves a lower gradient complexity of $\tilde{O}(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point without large batches, improving the existing results for adaptive GDA methods by a factor of $O(\sqrt{\kappa})$.
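A toy sketch of momentum-based adaptive gradient descent ascent on a simple convex-concave saddle problem, to illustrate the general template; this is not the paper's exact AdaGDA update, and the objective, step sizes, and decay schedule are assumptions:

```python
# Sketch: momentum-based adaptive gradient descent ascent (GDA) on a toy
# convex-concave saddle problem min_x max_y f(x, y). Illustrative only;
# this is not the paper's exact AdaGDA update rule.
import numpy as np

def grad_x(x, y):  # toy objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2
    return x + y

def grad_y(x, y):
    return x - y

def adagda(x, y, steps=2000, lr=0.05, beta=0.9, eps=1e-8):
    mx = my = 0.0  # momentum-averaged gradient estimates
    vx = vy = 0.0  # adaptive second-moment accumulators
    for t in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        mx, my = beta * mx + (1 - beta) * gx, beta * my + (1 - beta) * gy
        vx, vy = beta * vx + (1 - beta) * gx**2, beta * vy + (1 - beta) * gy**2
        step = lr / np.sqrt(t + 1)            # decaying step size
        x -= step * mx / (np.sqrt(vx) + eps)  # descent in x
        y += step * my / (np.sqrt(vy) + eps)  # ascent in y
    return x, y

print(adagda(1.0, 1.0))  # approaches the saddle point (0, 0)
```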
no code implementations • 9 Feb 2021 • Zhengmian Hu, Feihu Huang, Heng Huang
Moreover, our HMC methods with biased gradient estimators, such as SARAH and SARGE, require $\tilde{O}(N+\sqrt{N} \kappa^2 d^{\frac{1}{2}} \varepsilon^{-1})$ gradient complexity, which has the same dependence on the condition number $\kappa$ and dimension $d$ as the full-gradient method, but improves the dependence on the sample size $N$ by a factor of $N^\frac{1}{2}$.