Search Results for author: Dan Garrette

Found 19 papers, 8 papers with code

Frequency Effects on Syntactic Rule Learning in Transformers

1 code implementation EMNLP 2021 Jason Wei, Dan Garrette, Tal Linzen, Ellie Pavlick

Pre-trained language models perform well on a variety of linguistic tasks that require symbolic reasoning, raising the question of whether such models implicitly represent abstract symbols and rules.

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

3 code implementations 11 Mar 2021 Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting

Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly-used models still require an explicit tokenization step.

Improving Multilingual Models with Language-Clustered Vocabularies

no code implementations EMNLP 2020 Hyung Won Chung, Dan Garrette, Kiat Chuan Tan, Jason Riesa

State-of-the-art multilingual models depend on vocabularies that cover all of the languages the model will expect to see at inference time, but the standard methods for generating those vocabularies are not ideal for massively multilingual applications.

NER

How multilingual is Multilingual BERT?

2 code implementations ACL 2019 Telmo Pires, Eva Schlinger, Dan Garrette

In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language.

Language Modelling, Translation

Part-of-Speech Tagging for Code-Switched, Transliterated Texts without Explicit Language Identification

no code implementations EMNLP 2018 Kelsey Ball, Dan Garrette

Code-switching, the use of more than one language within a single utterance, is ubiquitous in much of the world, but remains a challenge for NLP largely due to the lack of representative data for training models.

Language Identification, Part-Of-Speech Tagging, +2

Automatic Compositor Attribution in the First Folio of Shakespeare

no code implementations ACL 2017 Maria Ryskina, Hannah Alpert-Abrams, Dan Garrette, Taylor Berg-Kirkpatrick

Compositor attribution, the clustering of pages in a historical printed document by the individual who set the type, is a bibliographic task that relies on analysis of orthographic variation and inspection of visual details of the printed page.

Optical Character Recognition

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.

graph construction
