1 code implementation • 25 Apr 2024 • Stephen Bothwell, Brian DuSell, David Chiang, Brian Krostenko
To assist historical linguists in the study of Italic sound change, we introduce the Proto-Italic to Latin (PILA) dataset, which consists of roughly 3,000 pairs of forms from Proto-Italic and Latin.
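As a sketch of what such a resource might look like in practice (the file name, column format, and example pair below are illustrative assumptions, not taken from the released dataset):

```python
import csv

# Hypothetical two-column TSV of (Proto-Italic, Latin) form pairs,
# e.g. a line like:  *dekem <TAB> decem  ('ten')
with open('pila.tsv', newline='', encoding='utf-8') as f:
    pairs = [(row[0], row[1]) for row in csv.reader(f, delimiter='\t')]

print(len(pairs))  # roughly 3,000 pairs, per the paper
```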
1 code implementation • 3 Oct 2023 • Brian DuSell, David Chiang
Attention, specifically scaled dot-product attention, has proven effective for natural language, but it does not have a mechanism for handling hierarchical patterns of arbitrary nesting depth, which limits its ability to recognize certain syntactic structures.
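For reference, scaled dot-product attention is the standard softmax(QK^T / sqrt(d))V computation; below is a minimal NumPy sketch (function and variable names are illustrative, not from the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d)      # (..., n_q, n_k)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # (..., n_q, d_v)

# Toy usage: 3 queries attending over 5 keys/values of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=s) for s in [(3, 8), (5, 8), (5, 8)])
out = scaled_dot_product_attention(Q, K, V)  # shape (3, 8)
```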
no code implementations • 25 Apr 2023 • Brian DuSell
Human language is full of compositional syntactic structures, and although neural networks have contributed to groundbreaking improvements in computer systems that process language, widely used neural network architectures still exhibit limitations in their ability to process syntax.
1 code implementation • 13 Oct 2022 • Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, David Chiang
Weighted pushdown automata (WPDAs) are at the core of many natural language processing tasks, like syntax-based statistical machine translation and transition-based dependency parsing.
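As a rough illustration of the formalism (not of the paper's algorithms), a weighted pushdown automaton assigns each run a weight, typically the product of its transition weights. Here is a minimal sketch that scores strings of balanced brackets, with made-up transition weights:

```python
# A toy weighted PDA over {'(', ')'} that accepts balanced strings.
# A run's weight is the product of its transition weights.
PUSH_W, POP_W = 0.9, 0.8  # hypothetical transition weights

def wpda_weight(s: str) -> float:
    """Return the run weight if s is balanced, else 0.0 (no accepting run)."""
    stack, weight = [], 1.0
    for ch in s:
        if ch == '(':
            stack.append(ch)
            weight *= PUSH_W                  # push transition
        elif ch == ')':
            if not stack:
                return 0.0                    # no valid transition: reject
            stack.pop()
            weight *= POP_W                   # pop transition
    return weight if not stack else 0.0       # accept iff stack is empty

print(wpda_weight('(()())'))  # 0.9**3 * 0.8**3 ≈ 0.373
print(wpda_weight('(()'))     # 0.0 (unbalanced)
```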
2 code implementations • 4 Oct 2022 • Brian DuSell, David Chiang
Among other findings, the nondeterministic stack RNN can recognize languages with much larger alphabet sizes than one might expect given the size of its stack alphabet.
1 code implementation • ICLR 2022 • Brian DuSell, David Chiang
Learning hierarchical structures in sequential data, from simple algorithmic patterns to natural language, in a reliable and generalizable way remains a challenging problem for neural language models.
1 code implementation • CoNLL 2020 • Brian DuSell, David Chiang
We present a differentiable stack data structure that simultaneously and tractably encodes an exponential number of stack configurations, based on Lang's algorithm for simulating nondeterministic pushdown automata.
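The differentiable construction itself is involved; for intuition, here is the naive set-based simulation of a nondeterministic PDA, which tracks every reachable configuration explicitly. This is the exponential blowup that Lang's algorithm avoids by sharing stack suffixes in a dynamic-programming table. The palindrome automaton and its transitions are invented for illustration:

```python
# Naive NPDA simulation: track every reachable (mode, stack) configuration.
# Lang's algorithm tabulates these configurations so that shared stack
# suffixes are represented once; this brute-force version is intuition only.

def npda_accepts(s: str) -> bool:
    """Recognize nonempty even-length palindromes over {a, b} by guessing."""
    # Mode 'push' before the guessed midpoint, 'pop' after it.
    # The midpoint guess is the source of nondeterminism.
    configs = {('push', ())}
    for ch in s:
        nxt = set()
        for mode, stack in configs:
            if mode == 'push':
                nxt.add(('push', stack + (ch,)))     # keep pushing
                if stack and stack[-1] == ch:        # or guess: the midpoint
                    nxt.add(('pop', stack[:-1]))     # was just before ch
            elif stack and stack[-1] == ch:
                nxt.add(('pop', stack[:-1]))         # keep popping matches
        configs = nxt
    return any(mode == 'pop' and not stack for mode, stack in configs)

print(npda_accepts('abba'))  # True
print(npda_accepts('abab'))  # False
```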
no code implementations • WS 2019 • Kenton Murray, Brian DuSell, David Chiang
We investigated the impact of applying auto-sizing (Murray and Chiang, 2015; Murray et al., 2019) to the Transformer network (Vaswani et al., 2017), with the goal of substantially reducing the number of parameters in the model.
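Auto-sizing adds a group regularizer to the training loss so that entire rows of a weight matrix are driven to zero and the corresponding units can be pruned away. Below is a minimal PyTorch sketch of an l2,1 (group lasso) penalty; the layer sizes, coefficient, and threshold are illustrative assumptions, and the cited papers also consider an l-infinity,1 variant:

```python
import torch
import torch.nn as nn

def l21_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Group-lasso (l_{2,1}) penalty: the sum of the l2 norms of the rows.
    Drives whole rows toward zero so the corresponding units can be pruned."""
    return weight.norm(p=2, dim=1).sum()

# Toy feed-forward sublayer; pruning a row of `up` (and the matching
# column of `down`) removes one hidden unit entirely.
up, down = nn.Linear(64, 256), nn.Linear(256, 64)
x = torch.randn(8, 64)
task_loss = down(torch.relu(up(x))).pow(2).mean()  # stand-in for the real loss
lam = 1e-3                                         # regularizer strength (illustrative)
loss = task_loss + lam * l21_penalty(up.weight)
loss.backward()

# After training, units whose rows are (near) zero can be deleted:
keep = up.weight.norm(dim=1) > 1e-6
print(f'hidden units kept: {int(keep.sum())} / {keep.numel()}')
```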