Search Results for author: Yixiao Huang

Found 2 papers, 0 papers with code

Mechanics of Next Token Prediction with Self-Attention

no code implementations • 12 Mar 2024 • Yingcong Li, Yixiao Huang, M. Emrullah Ildiz, Ankit Singh Rawat, Samet Oymak

We show that training self-attention with gradient descent learns an automaton which generates the next token in two distinct steps: (1) Hard retrieval: Given an input sequence, self-attention precisely selects the high-priority input tokens associated with the last input token.

Retrieval

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

no code implementations • 21 Feb 2024 • M. Emrullah Ildiz, Yixiao Huang, Yingcong Li, Ankit Singh Rawat, Samet Oymak

Modern language models rely on the transformer architecture and attention mechanism to perform language understanding and text generation.

Text Generation
