Search Results for author: Yixiao Huang

Found 2 papers, 0 papers with code

Mechanics of Next Token Prediction with Self-Attention

no code implementations • 12 Mar 2024 • Yingcong Li, Yixiao Huang, M. Emrullah Ildiz, Ankit Singh Rawat, Samet Oymak

We show that training self-attention with gradient descent learns an automaton which generates the next token in two distinct steps: (1) Hard retrieval: Given input sequence, self-attention precisely selects the high-priority input tokens associated with the last input token.

Retrieval


From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

no code implementations • 21 Feb 2024 • M. Emrullah Ildiz, Yixiao Huang, Yingcong Li, Ankit Singh Rawat, Samet Oymak

Modern language models rely on the transformer architecture and attention mechanism to perform language understanding and text generation.

Text Generation

