Search Results for author: Jerry Yao-Chieh Hu

Found 20 papers, 7 papers with code

AlignAb: Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies

no code implementations30 Dec 2024 Yibo Wen, Chenwei Xu, Jerry Yao-Chieh Hu, Han Liu

We present a three-stage framework for training deep learning models specializing in antibody sequence-structure co-design.

Language Modeling

On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality

no code implementations26 Nov 2024 Jerry Yao-Chieh Hu, Weimin Wu, Yi-Chen Lee, Yu-Chao Huang, Minshuo Chen, Han Liu

We investigate the approximation and estimation rates of conditional diffusion transformers (DiTs) with classifier-free guidance.

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

no code implementations25 Nov 2024 Jerry Yao-Chieh Hu, Wei-Po Wang, Ammar Gilani, Chenyang Li, Zhao Song, Han Liu

Our key contributions show that prompt tuning on single-head transformers with only a single self-attention layer (i) is universal and (ii) supports efficient (even almost-linear time) algorithms under the Strong Exponential Time Hypothesis (SETH).

Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training

no code implementations25 Nov 2024 Weimin Wu, Maojiang Su, Jerry Yao-Chieh Hu, Zhao Song, Han Liu

We investigate the transformer's capability for in-context learning (ICL) to simulate the training process of deep models.

In-Context Learning

On Differentially Private String Distances

no code implementations8 Nov 2024 Jerry Yao-Chieh Hu, Erzhi Liu, Han Liu, Zhao Song, Lichen Zhang

Given a database of bit strings $A_1,\ldots, A_m\in \{0, 1\}^n$, a fundamental data structure task is to estimate the distances between a given query $B\in \{0, 1\}^n$ and all the strings in the database.
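
As a point of reference for the task statement above, a minimal non-private baseline simply scans the database and computes exact Hamming distances. The array names below mirror the abstract, but the snippet only illustrates the query being answered; it is not the paper's differentially private data structure.

```python
import numpy as np

# Naive baseline: exact Hamming distance from a query bit string B to every
# database string A_1, ..., A_m (rows of A). The paper's data structure
# answers such distance queries under differential privacy.
def hamming_distances(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    return (A != B).sum(axis=1)  # one distance per database string

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(5, 16))  # m = 5 strings of length n = 16
B = rng.integers(0, 2, size=16)
print(hamming_distances(A, B))
```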

Differentially Private Kernel Density Estimation

no code implementations3 Sep 2024 Erzhi Liu, Jerry Yao-Chieh Hu, Alex Reneau, Zhao Song, Han Liu

In this paper, we improve the best previous result [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024] in three aspects:
- We reduce query time by a factor of $\alpha^{-1} \log n$.

Density Estimation
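
For context on what "query time" refers to above: a kernel density query averages kernel similarities between a query point and every dataset point, as in the brute-force sketch below. The Gaussian kernel and bandwidth are illustrative assumptions; the paper's contribution is a differentially private data structure that answers such queries faster than this linear scan.

```python
import numpy as np

# Brute-force (non-private) KDE query: average kernel similarity between a
# query point and all data points. Illustrates the query, not the paper's method.
def kde_query(data: np.ndarray, query: np.ndarray, bandwidth: float = 1.0) -> float:
    sq_dists = ((data - query) ** 2).sum(axis=1)
    return float(np.exp(-sq_dists / (2.0 * bandwidth ** 2)).mean())

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 8))  # 1000 points in 8 dimensions
query = rng.standard_normal(8)
print(kde_query(data, query))
```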

On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

no code implementations1 Jul 2024 Jerry Yao-Chieh Hu, Weimin Wu, Zhao Song, Han Liu

For backward computation, we leverage the low-rank structure within the gradient computation of DiTs training for possible algorithmic speedup.

Computational Efficiency
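
The general intuition behind exploiting low-rank structure for speedups is shown in the toy below: if a gradient-like matrix factors as $UV$ with small rank, multiplying through the factors avoids ever materializing the full matrix. This is a generic linear-algebra illustration, not the paper's specific backward-pass algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, k = 1024, 16, 1024
U = rng.standard_normal((d, r))   # low-rank factors of a gradient-like matrix G = U @ V
V = rng.standard_normal((r, d))
W = rng.standard_normal((d, k))

# Naive: materialize the full d x d matrix G, then multiply.
out_naive = (U @ V) @ W

# Factored: multiply through the rank-r factors; cost drops from O(d^2 k) to O(d r k).
out_fast = U @ (V @ W)

print(np.allclose(out_naive, out_fast))  # True up to floating-point error
```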

Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

no code implementations5 Jun 2024 Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu

We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory.
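
For reference, the LoRA update analyzed here augments a frozen weight with a product of two rank-$r$ factors (Hu et al., 2021). The sketch below shows that parameterization with illustrative sizes; it is not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 768, 768, 8        # illustrative sizes, with rank r << d
alpha = 16.0                        # LoRA scaling hyperparameter

W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
B = np.zeros((d_out, r))                   # trainable factor, initialized to zero
A = 0.01 * rng.standard_normal((r, d_in))  # trainable factor

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W0 + (alpha / r) * B @ A; computed through the
    # factors so the full update matrix is never materialized.
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((4, d_in))  # a batch of 4 inputs
print(lora_forward(x).shape)        # (4, 768)
```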

Decoupled Alignment for Robust Plug-and-Play Adaptation

no code implementations3 Jun 2024 Haozheng Luo, Jiahao Yu, Wenxin Zhang, Jialong Li, Jerry Yao-Chieh Hu, Xinyu Xing, Han Liu

We introduce a low-resource safety enhancement method for aligning large language models (LLMs) without the need for supervised fine-tuning (SFT) or reinforcement learning from human feedback (RLHF).

Knowledge Distillation

Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

no code implementations31 May 2024 Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh Hu, Wenbo Guo, Han Liu, Xinyu Xing

Attackers carefully craft jailbreaking prompts such that a target LLM will respond to harmful questions.

Safety Alignment

Nonparametric Modern Hopfield Models

1 code implementation5 Apr 2024 Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant.

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

1 code implementation4 Apr 2024 Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Robin Luo, Hong-Yu Chen, Weijian Li, Wei-Po Wang, Han Liu

We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of training gigantic transformer-based models.

Benchmarking Quantization +1
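
One known recipe for taming attention/Hopfield outliers is to let each query place probability mass on an implicit "no-op" slot, i.e., a softmax with an extra zero logit in the denominator. The sketch below is an assumption-based toy in that spirit; it is not claimed to reproduce the paper's exact $\mathrm{OutEffHop}$ layer.

```python
import numpy as np

def softmax_one(z: np.ndarray, axis: int = -1) -> np.ndarray:
    # Softmax with an implicit extra zero logit: weights can sum to less than 1,
    # letting a query "attend to nothing" instead of forcing mass onto outliers.
    m = z.max(axis=axis, keepdims=True)
    e = np.exp(z - m)
    return e / (np.exp(-m) + e.sum(axis=axis, keepdims=True))

# Hopfield-style retrieval with the modified normalization (toy example).
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 64))   # stored memory patterns (rows)
q = rng.standard_normal(64)         # query pattern
beta = 1.0                          # inverse temperature
w = softmax_one(beta * X @ q)       # association weights over memories
retrieved = w @ X                   # retrieved pattern
print(w.sum())                      # <= 1: leftover mass went to the no-op slot
```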

BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model

1 code implementation4 Apr 2024 Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu, Weijian Li, Ammar Gilani, Hsi-Sheng Goan, Han Liu

We introduce the Bi-Directional Sparse Hopfield Network (BiSHop), a novel end-to-end framework for deep tabular learning.

Representation Learning

Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

1 code implementation4 Apr 2024 Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

Specifically, we accomplish this by constructing a separation loss $\mathcal{L}_\Phi$ that separates the local minima of the kernelized energy by pushing stored memory patterns apart in kernel space.

Retrieval
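
As a toy rendering of the separation idea, the sketch below defines a loss that pushes stored memory patterns apart under a feature map, which is the mechanism that separates the corresponding energy minima. The random-projection feature map and the pairwise-similarity form are illustrative assumptions and are not claimed to be the paper's exact $\mathcal{L}_\Phi$.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))   # fixed random projection standing in for the kernel feature map (assumption)

def phi(x: np.ndarray) -> np.ndarray:
    # Illustrative feature map into "kernel space".
    return np.maximum(x @ W, 0.0)

def separation_loss(memories: np.ndarray) -> float:
    # Average pairwise cosine similarity between distinct stored patterns in
    # feature space; minimizing it pushes the patterns, and hence their energy
    # minima, apart.
    F = phi(memories)
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-8)
    sim = F @ F.T
    n = len(memories)
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))

memories = rng.standard_normal((16, 64))  # 16 stored patterns of dimension 64
print(separation_loss(memories))
```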

On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

no code implementations7 Feb 2024 Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu

Specifically, we establish an upper bound criterion for the norm of input query patterns and memory patterns.

Retrieval

STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

1 code implementation28 Dec 2023 Dennis Wu, Jerry Yao-Chieh Hu, Weijian Li, Bo-Yu Chen, Han Liu

We present STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities.

Retrieval Time Series +1

On Sparse Modern Hopfield Model

1 code implementation NeurIPS 2023 Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu

Building upon this, we derive the sparse memory retrieval dynamics from the sparse energy function and show its one-step approximation is equivalent to the sparse-structured attention.

Retrieval
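
To make the "one-step approximation equals sparse-structured attention" statement concrete, the sketch below performs a single retrieval step using a sparsemax normalization (Martins & Astudillo, 2016) over memory similarities. The dimensions and inverse temperature are illustrative, and this is a didactic sketch rather than the paper's released code.

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    # Sparsemax (Martins & Astudillo, 2016): Euclidean projection of the logits
    # onto the probability simplex; many output entries are exactly zero.
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1.0 + k * z_sorted > cumsum
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max
    return np.maximum(z - tau, 0.0)

# One retrieval step of a sparse Hopfield model: a sparse-attention readout of
# the stored patterns X for the query q.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 32))   # 8 stored memory patterns (rows)
q = rng.standard_normal(32)        # query pattern
beta = 2.0                         # inverse temperature (illustrative)
p = sparsemax(beta * (X @ q))      # sparse association weights over memories
q_new = p @ X                      # one-step retrieval update
print(p)                           # note the exact zeros
```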

Feature Programming for Multivariate Time Series Prediction

1 code implementation9 Jun 2023 Alex Reneau, Jerry Yao-Chieh Hu, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu

We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework.

Automated Feature Engineering Feature Engineering +4
