Search Results for author: Ting-Han Fan

Found 12 papers, 5 papers with code

Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation

no code implementations · 1 Nov 2023 · Ta-Chung Chi, Ting-Han Fan, Alexander I. Rudnicky

This suggests that a flexible positional embedding design and attention alignment can go a long way toward Transformer length extrapolation.

Code Completion · Language Modelling · +2

Advancing Regular Language Reasoning in Linear Recurrent Neural Networks

1 code implementation · 14 Sep 2023 · Ting-Han Fan, Ta-Chung Chi, Alexander I. Rudnicky

In recent studies, linear recurrent neural networks (LRNNs) have achieved Transformer-level performance in natural language and long-range modeling, while offering rapid parallel training and constant inference cost.

Long-range modeling
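
For intuition on the constant-inference-cost claim in the abstract above, here is a minimal sketch of a generic diagonal linear recurrence (an illustration only, not the specific LRNN variant the paper studies). Because the update is linear in the hidden state, generation needs only the last state at O(1) cost per token, and the same computation admits a parallel scan at training time:

```python
import torch

def lrnn_sequential(a, x, h0=None):
    """Generic diagonal linear recurrence: h_t = a_t * h_{t-1} + x_t.
    Linearity in h is what permits an equivalent parallel-scan form for
    training, while inference carries only the most recent state."""
    T, d = x.shape
    h = torch.zeros(d) if h0 is None else h0
    out = []
    for t in range(T):
        h = a[t] * h + x[t]
        out.append(h)
    return torch.stack(out)

# toy usage: 6 time steps, state dimension 4
a = 0.9 * torch.rand(6, 4)   # per-step decay gates in (0, 0.9)
x = torch.randn(6, 4)
print(lrnn_sequential(a, x).shape)  # torch.Size([6, 4])
```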

Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation

no code implementations · 5 May 2023 · Ta-Chung Chi, Ting-Han Fan, Alexander I. Rudnicky, Peter J. Ramadge

Conventional wisdom has it that, unlike recurrent models, Transformers cannot perfectly model regular languages.

Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis

no code implementations · 20 Dec 2022 · Ta-Chung Chi, Ting-Han Fan, Alexander I. Rudnicky, Peter J. Ramadge

Length extrapolation permits training a transformer language model on short sequences while preserving perplexity when the model is tested on substantially longer sequences.

Language Modelling
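
Concretely, length extrapolation is measured by evaluating one trained model at several context lengths. Below is a hedged sketch of that protocol; the `model` interface and the chunked evaluation are assumptions for illustration, not the paper's exact setup:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, ids, seq_len):
    """Average next-token cross-entropy over non-overlapping chunks of
    length seq_len; extrapolation holds if this stays flat when seq_len
    exceeds the training length."""
    losses = []
    for i in range(0, ids.numel() - seq_len - 1, seq_len):
        chunk = ids[i : i + seq_len + 1].unsqueeze(0)
        logits = model(chunk[:, :-1])  # assumed: (1, seq_len, vocab) logits
        losses.append(torch.nn.functional.cross_entropy(
            logits.flatten(0, 1), chunk[:, 1:].flatten()).item())
    return math.exp(sum(losses) / len(losses))

# e.g. train at length 512, then report perplexity(model, ids, L) for L in (512, 1024, 4096)
```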

Training Discrete Deep Generative Models via Gapped Straight-Through Estimator

1 code implementation · 15 Jun 2022 · Ting-Han Fan, Ta-Chung Chi, Alexander I. Rudnicky, Peter J. Ramadge

While deep generative models have succeeded in image processing, natural language processing, and reinforcement learning, training models that involve discrete random variables remains challenging due to the high variance of the gradient estimates.

ListOps · reinforcement-learning · +1
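
For background on the variance problem this paper targets, here is the vanilla straight-through estimator that the gapped variant builds on. This sketch is the standard baseline technique only, not the paper's Gapped Straight-Through Estimator:

```python
import torch
import torch.nn.functional as F

def straight_through_sample(logits):
    """Vanilla straight-through: the forward pass emits a discrete one-hot
    sample, while the backward pass routes gradients through the softmax."""
    probs = F.softmax(logits, dim=-1)
    idx = torch.multinomial(probs, num_samples=1).squeeze(-1)
    hard = F.one_hot(idx, num_classes=logits.size(-1)).float()
    return hard + probs - probs.detach()  # value: hard; gradient: soft

logits = torch.randn(2, 5, requires_grad=True)
values = torch.randn(5)
loss = (straight_through_sample(logits) @ values).sum()
loss.backward()  # gradients reach logits despite the discrete sample
```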

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

2 code implementations · 20 May 2022 · Ta-Chung Chi, Ting-Han Fan, Peter J. Ramadge, Alexander I. Rudnicky

Relative positional embeddings (RPE) have received considerable attention because they effectively model the relative distance between tokens and enable length extrapolation.

Language Modelling · Position
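
As a rough illustration of the idea, a kernelized relative bias can be added to the attention logits as a learnable function of token distance. The sketch below reconstructs the logarithmic variant from memory; treat the exact parameterization as an assumption, not the paper's definitive form:

```python
import torch
import torch.nn.functional as F

def kerple_log_bias(seq_len, r1, r2):
    """KERPLE-style logarithmic bias: b[i, j] = -r1 * log(1 + r2 * |i - j|),
    with r1, r2 > 0 learnable; added to attention logits before softmax."""
    pos = torch.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).abs().float()
    return -r1 * torch.log1p(r2 * dist)

# keep the two scalars positive via softplus of unconstrained parameters
raw1, raw2 = torch.zeros(()), torch.zeros(())
bias = kerple_log_bias(8, F.softplus(raw1), F.softplus(raw2))
# attn = softmax(q @ k.transpose(-1, -2) / d ** 0.5 + bias)
```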

Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective

1 code implementation · 6 Oct 2021 · Ting-Han Fan, Peter J. Ramadge

Off-policy Actor-Critic algorithms have demonstrated phenomenal empirical performance, but a clear explanation of why they work is still lacking.

PowerGym: A Reinforcement Learning Environment for Volt-Var Control in Power Distribution Systems

1 code implementation · 8 Sep 2021 · Ting-Han Fan, Xian Yeow Lee, YuBo Wang

We introduce PowerGym, an open-source reinforcement learning environment for Volt-Var control in power distribution systems.

OpenAI Gym · reinforcement-learning · +1
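
Since PowerGym follows the OpenAI Gym interface, it plugs into the standard interaction loop. The sketch below uses a built-in Gym environment as a stand-in, because the actual PowerGym environment ids and observation/action spaces are defined in the project's own repository; the classic reset/step API is assumed:

```python
import gym

env = gym.make("CartPole-v1")  # stand-in id; a PowerGym env would replace this
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    # random policy; real Volt-Var control would set device setpoints instead
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("episode return:", total_reward)
```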

A Contraction Approach to Model-based Reinforcement Learning

no code implementations · 18 Sep 2020 · Ting-Han Fan, Peter J. Ramadge

Despite its experimental success, Model-based Reinforcement Learning still lacks a complete theoretical understanding.

Imitation Learning · Model-based Reinforcement Learning · +2

Model Imitation for Model-Based Reinforcement Learning

no code implementations · 25 Sep 2019 · Yueh-Hua Wu, Ting-Han Fan, Peter J. Ramadge, Hao Su

Based on this claim, we propose to learn the transition model by using a WGAN to match the distribution of multi-step rollouts sampled from the transition model to the distribution of real rollouts.

Model-based Reinforcement Learning · reinforcement-learning · +1
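
A minimal sketch of that objective, assuming a generic WGAN critic over flattened multi-step state rollouts; the dimensions, Lipschitz handling, and rollout format are illustrative, not the paper's exact procedure:

```python
import torch
import torch.nn as nn

state_dim, horizon = 4, 5
critic = nn.Sequential(
    nn.Linear(state_dim * horizon, 64), nn.ReLU(), nn.Linear(64, 1))

def critic_loss(real_rollouts, model_rollouts):
    """Wasserstein critic objective over (batch, horizon, state_dim) rollouts;
    in practice a gradient penalty or weight clipping enforces Lipschitzness."""
    return (critic(model_rollouts.flatten(1)).mean()
            - critic(real_rollouts.flatten(1)).mean())

def model_loss(model_rollouts):
    """The transition model is updated so its rollouts score like real ones."""
    return -critic(model_rollouts.flatten(1)).mean()

# toy usage with random stand-in rollouts
real = torch.randn(8, horizon, state_dim)
fake = torch.randn(8, horizon, state_dim)
print(critic_loss(real, fake).item(), model_loss(fake).item())
```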
