Search Results for author: Xingwu Chen

Found 1 paper, 0 papers with code

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

no code implementations • 2 Apr 2024 • Xingwu Chen, Difan Zou

Specifically, we designed a novel set of sequence learning tasks to systematically evaluate and comprehend how the depth of a transformer affects its ability to perform memorization, reasoning, generalization, and contextual generalization.

Memorization
