Search Results for author: Phillip Rust

Found 8 papers, 3 papers with code

Text Rendering Strategies for Pixel Language Models

no code implementations • 1 Nov 2023 • Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott

Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling.

Language Modelling, Sentence

PHD: Pixel-Based Language Modeling of Historical Documents

1 code implementation • 22 Oct 2023 • Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period.

Language Modelling, Optical Character Recognition (OCR)

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

no code implementations • 17 Aug 2023 • Phillip Rust, Anders Søgaard

Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages.

Fairness, XLM-R

Language Modelling with Pixels

1 code implementation • 14 Jul 2022 • Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott

We pretrain the 86M parameter PIXEL model on the same English data as BERT and evaluate on syntactic and semantic tasks in typologically diverse languages, including various non-Latin scripts.

Language Modelling, Named Entity Recognition (NER)

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.

Pretrained Multilingual Language Models

PuzzLing Machines: A Challenge on Learning From Small Data

no code implementations • ACL 2020 • Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych

To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students.

Small Data Image Classification
