Search Results for author: Andy Yang

Found 4 papers, 1 paper with code

Simulating Hard Attention Using Soft Attention

no code implementations • 13 Dec 2024 • Andy Yang, Lena Strobl, David Chiang, Dana Angluin

We demonstrate how temperature scaling allows softmax transformers to simulate a large subclass of average-hard attention transformers: those that have what we call the uniform-tieless property (see the sketch after this entry).

Hard Attention
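
As a hedged illustration of the temperature-scaling idea (our sketch under standard definitions, not the paper's construction; function names are hypothetical), the NumPy snippet below compares softmax attention weights at decreasing temperatures with average-hard attention, which places uniform weight on every maximum-scoring position:

```python
import numpy as np

def softmax_attention(scores, temperature=1.0):
    """Softmax attention weights with temperature scaling."""
    z = scores / temperature
    z = z - z.max()              # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def average_hard_attention(scores):
    """Uniform weight over all maximum-scoring positions (ties averaged)."""
    mask = scores == scores.max()
    return mask / mask.sum()

scores = np.array([2.0, 5.0, 5.0, 1.0])      # two positions tie at the max
for t in [1.0, 0.1, 0.01]:
    print(f"T={t}:", softmax_attention(scores, t).round(4))
print("avg-hard:", average_hard_attention(scores))
# As T shrinks, the softmax weights approach [0, 0.5, 0.5, 0],
# the average-hard attention distribution over the tied maxima.
```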

A Formal Framework for Understanding Length Generalization in Transformers

1 code implementation • 3 Oct 2024 • Xinting Huang, Andy Yang, Satwik Bhattamishra, Yash Sarrof, Andreas Krebs, Hattie Zhou, Preetum Nakkiran, Michael Hahn

A major challenge for transformers is generalizing to sequences longer than those observed during training.

Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers

no code implementations • 5 Apr 2024 • Andy Yang, David Chiang

Deriving formal bounds on the expressivity of transformers and studying transformers constructed to implement known algorithms are both effective methods for better understanding their computational power (a toy counting sketch follows below).
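
As a toy illustration of the counting primitive the title refers to (our sketch, not the paper's compilation procedure; the function name is hypothetical), uniform average attention over 0/1 values computes the fraction of positions satisfying a predicate, from which a count can be recovered when the length is known:

```python
import numpy as np

def fraction_matching(tokens, target):
    """Uniform (average) attention over 0/1 values yields the fraction of
    positions equal to `target` -- a basic counting primitive."""
    values = np.array([1.0 if t == target else 0.0 for t in tokens])
    weights = np.full(len(tokens), 1.0 / len(tokens))  # uniform attention
    return weights @ values                            # weighted average

tokens = list("abcabba")
frac = fraction_matching(tokens, "a")
print(frac, frac * len(tokens))  # ≈ 4/7 of positions; scaling by the
                                 # length recovers the count (≈ 4)
```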

Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages

no code implementations • 21 Oct 2023 • Andy Yang, David Chiang, Dana Angluin

The expressive power of transformers over inputs of unbounded size can be studied through their ability to recognize classes of formal languages.

Hard Attention • Position
