no code implementations • 14 Feb 2024 • Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard
A major impediment to the advancement of sign language translation (SLT) is data scarcity.
no code implementations • 1 Nov 2023 • Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott
Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling.
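The core idea — treating a rendered strip of text as a sequence of image patches rather than subword tokens — can be sketched as follows. This is a minimal illustration with NumPy; the strip height, width, and patch size here are assumptions for the example (actual pixel-based models render text with a real text renderer and feed the patches to a ViT-style encoder).

```python
import numpy as np

# Toy stand-in for a line of text rendered as a grayscale image:
# a fixed-height strip 16 pixels tall and 368 pixels wide.
# (Dimensions are illustrative, not taken from any specific model.)
image = np.zeros((16, 368), dtype=np.float32)

# Split the strip into non-overlapping 16x16 patches — each patch
# plays the role of one input "token", so any script the renderer
# can draw becomes representable without a fixed vocabulary.
patch_size = 16
num_patches = image.shape[1] // patch_size          # 368 // 16 = 23
patches = image.reshape(16, num_patches, patch_size).transpose(1, 0, 2)

# Flatten each patch into a vector, as a ViT-style encoder would
# before projecting it into the model's embedding space.
patch_vectors = patches.reshape(num_patches, -1)
print(patch_vectors.shape)  # (23, 256)
```

Because the input is pixels rather than vocabulary indices, out-of-vocabulary symbols simply never arise — any character the renderer can draw is handled the same way.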
1 code implementation • 22 Oct 2023 • Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein
We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period.
no code implementations • 17 Aug 2023 • Phillip Rust, Anders Søgaard
Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages.
1 code implementation • 14 Jul 2022 • Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott
We pretrain the 86M-parameter PIXEL model on the same English data as BERT and evaluate it on syntactic and semantic tasks in typologically diverse languages, including various non-Latin scripts.
Ranked #1 on Named Entity Recognition (NER) on MasakhaNER
no code implementations • ACL 2022 • Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard
Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages.
1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
no code implementations • ACL 2020 • Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych
To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students.